All our algorithms and analyses were built using stable version 4.3.0 of JAGS, stable version 0.10.1 of NIMBLE, and R version 3.6.3. We also use the packages rjags (Denwood 2016) and nimble (Valpine et al. 2020) to build models; coda (Plummer et al. 2009) and MCMCvis (Youngflesh 2018) to extract summary information from MCMC output; AlphaSimR (Gaynor, Gorjanc, and Hickey 2020) to simulate animal breeding programmes; MatrixModels (Bates and Maechler 2015) and pedigreemm (Bates and Vazquez 2014); ggplot2 (Wickham 2016), patchwork (Pedersen 2020), and ggrepel (Slowikowski 2020) to prepare graphical outputs; and dplyr (Wickham et al. 2020), knitr (Xie 2020), kableExtra (Zhu 2020), and formattable (Ren and Russell 2016) for general purposes.

sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
##  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
##  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.6.3    magrittr_2.0.1    bookdown_0.21     tools_3.6.3      
##  [5] htmltools_0.5.1.1 yaml_2.2.1        stringi_1.5.3     rmarkdown_2.6    
##  [9] knitr_1.31        stringr_1.4.0     xfun_0.20         digest_0.6.27    
## [13] rlang_0.4.10      evaluate_0.14

1 Brief description of animal model

Animal models have been used to gain knowledge about genetic and environmental parameters. In general, phenotypic values are modelled as a function of fixed and random effects; this approach, the so-called mixed-effects model, has been used for many decades in different research fields. However, the random effects and their associated variance components have a special meaning in the genetics field, and we will explain these ideas and their interpretation while avoiding unnecessary complexity. Suppose a straightforward experiment where we measured the response variable \(y_{i}\), called the phenotype, on the \(i\)th individual. Some individuals have one or both parents unknown, and some belong to a group factor with two levels (1 or 2), as shown in the following data set:

Table 1.1: Example data set for animal model

individual  father  mother  phenotype  group
         1      NA      NA         NA     NA
         2      NA      NA        105      1
         3       2       1         98      1
         4       2      NA        101      2
         5       4       3        106      2
         6       2       3         93      2
         7       5       6         NA     NA
         8      NA      NA         NA     NA
         9       7       8        109      1
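For use in the code that follows, the data set can be entered as an R data.frame; the name example matches the one used throughout this document, although this construction is only a sketch, since the original loading code is not shown:

```r
## Table 1.1 as a data.frame; NA denotes unknown parents, phenotypes, or groups
example <- data.frame(
  individual = 1:9,
  father     = c(NA, NA, 2, 2, 4, 2, 5, NA, 7),
  mother     = c(NA, NA, 1, NA, 3, 3, 6, NA, 8),
  phenotype  = c(NA, 105, 98, 101, 106, 93, NA, NA, 109),
  group      = factor(c(NA, 1, 1, 2, 2, 2, NA, NA, 1)))
str(example)
```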

The simplest way to capture the expected relationship between \(y_i\) and some covariates is through the model:

\[\begin{align} \begin{array}{c} y_{ij} = b_{j} + a_{i} + e_{ij}, \\ b_{j} = m + Gr_{j}, \quad a_{i} \sim \mbox{N}\left(0, \sigma_{a}^{2}\right), \quad e_{ij} \sim \mbox{N}\left(0,\sigma^2_{e}\right) \end{array} \label{eq:basic_model} \end{align}\]

where \(y_{ij}\) is the phenotypic value of individual \(i\) from group \(j\), \(b_{j}\) is the mean of group \(j\), \(a_{i}\) is the additive genetic value of individual \(i\) (frequently called the breeding value), \(m\) is the overall mean, \(Gr_j\) is the effect of the \(j\)th group, and \(e_{ij}\) is the (experimental) error term. This model can easily be extended to account for more complex experiments, and it can be generalized in matrix notation as:

\[\begin{eqnarray} \mathbf{y} & = & \mathbf{X}\mathbf{b}+\mathbf{Z}\mathbf{a}+\mathbf{e},\label{eq:animal_model} \end{eqnarray}\]

where the design matrices (\(\mathbf{X}\) and \(\mathbf{Z}\)) link the fixed effects \(\mathbf{b}\) and the additive genetic values \(\mathbf{a}\) with the phenotypic values \((\mathbf{y})\). Standard assumptions for phenotypes following the normal (Gaussian) distribution are \(\mathbf{e}|\sigma_{e}^{2} \sim \mbox{N}\left(\mathbf{0},\sigma_{e}^{2}\mathbf{I}\right)\), \(\mathbf{y}|\mathbf{a} \sim \mbox{N}\left(\mathbf{X}\mathbf{b}+\mathbf{Z}\mathbf{a}, \sigma_{e}^{2}\mathbf{\mathbf{I}}\right)\), and marginally \(\mathbf{y} \sim \mbox{N}\left(\mathbf{X}\mathbf{b}, \mathbf{Z}\mathbf{G}\mathbf{Z}^{T}+\sigma_{e}^{2}\mathbf{\mathbf{I}}\right)\), where \(\mathbf{G}\) represents a general variance-covariance matrix for \(\mathbf{a}\). For now, let us assume \(\mathbf{G} = \sigma_{a}^{2}\mathbf{A}\), where \(\mathbf{A}\) is the numerator relationship matrix (NRM) used to account for the additive genetic covariance between records of related animals, and \(\sigma_{a}^{2}\) is the additive genetic variance. Consequently, \(\mathbf{a}|\mathbf{A},\sigma_{a}^{2} \sim \mbox{N}\left(\mathbf{0},\mathbf{A}\sigma_{a}^{2}\right)\). In addition, matrix \(\mathbf{A}\) can be decomposed as \(\mathbf{T}\mathbf{W}\mathbf{T}^{T}\), where \(\mathbf{T}\) is a dense unit lower triangular matrix describing the flow (trace) of genes in the pedigree, and \(\mathbf{W}\) is a diagonal matrix with values depending on the knowledge of parents and the parents' inbreeding coefficients.
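To make the structure of \(\mathbf{A}\) concrete, it can be computed for the example pedigree with the classical tabular method. The sketch below is stand-alone base R (the pedigree vectors are typed in directly, with 0 marking an unknown parent) and is not the code used later in the document:

```r
## Tabular method: A[i,i] = 1 + A[f,m]/2 and A[i,j] = (A[j,f] + A[j,m])/2,
## processing individuals so that parents always precede their offspring
fid <- c(0, 0, 2, 2, 4, 2, 5, 0, 7)  # fathers (0 = unknown)
mid <- c(0, 0, 1, 0, 3, 3, 6, 0, 8)  # mothers (0 = unknown)
nI  <- length(fid)
A   <- matrix(0, nI, nI)
for (i in seq_len(nI)) {
  f <- fid[i]; m <- mid[i]
  A[i, i] <- 1 + 0.5 * (if (f > 0 && m > 0) A[f, m] else 0)
  for (j in seq_len(i - 1)) {
    A[i, j] <- A[j, i] <- 0.5 * ((if (f > 0) A[j, f] else 0) +
                                 (if (m > 0) A[j, m] else 0))
  }
}
A[5, 5]      # 9/8: individual 5 is inbred because its parents 3 and 4 are related
diag(A) - 1  # inbreeding coefficients F
```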

Before we begin the specification of this model in R, let us check how the design matrices \(\mathbf{X}\) and \(\mathbf{Z}\), and matrices \(\mathbf{A}\), \(\mathbf{T}\), and \(\mathbf{W}\) should be specified based on our example:

  • Matrix \(\mathbf{X}\) associated with fixed effects:
(X <- model.Matrix(~ group, data=example))
## 6 x 2 Matrix of class "ddenseModelMatrix"
##   (Intercept) group2
## 2           1      0
## 3           1      0
## 4           1      1
## 5           1      1
## 6           1      1
## 9           1      0
  • Matrix \(\mathbf{Z}\) associated with random effects:
Ind <- as.factor(example$individual)
(Z <- model.Matrix(~ Ind - 1, data = example, sparse=TRUE))
## "dsparseModelMatrix": 9 x 9 sparse Matrix of class "dgCMatrix"
##   Ind1 Ind2 Ind3 Ind4 Ind5 Ind6 Ind7 Ind8 Ind9
## 1    1    .    .    .    .    .    .    .    .
## 2    .    1    .    .    .    .    .    .    .
## 3    .    .    1    .    .    .    .    .    .
## 4    .    .    .    1    .    .    .    .    .
## 5    .    .    .    .    1    .    .    .    .
## 6    .    .    .    .    .    1    .    .    .
## 7    .    .    .    .    .    .    1    .    .
## 8    .    .    .    .    .    .    .    1    .
## 9    .    .    .    .    .    .    .    .    1
## @ assign:  1 1 1 1 1 1 1 1 1 
## @ contrasts:
## $Ind
## [1] "contr.treatment"
  • Matrix \(\mathbf{A}\):
MASS::fractions(as.matrix(A))
##       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] 
##  [1,]     1     0   1/2     0   1/4   1/4   1/4     0   1/8
##  [2,]     0     1   1/2   1/2   1/2   3/4   5/8     0  5/16
##  [3,]   1/2   1/2     1   1/4   5/8   3/4 11/16     0 11/32
##  [4,]     0   1/2   1/4     1   5/8   3/8   1/2     0   1/4
##  [5,]   1/4   1/2   5/8   5/8   9/8  9/16 27/32     0 27/64
##  [6,]   1/4   3/4   3/4   3/8  9/16   5/4 29/32     0 29/64
##  [7,]   1/4   5/8 11/16   1/2 27/32 29/32 41/32     0 41/64
##  [8,]     0     0     0     0     0     0     0     1   1/2
##  [9,]   1/8  5/16 11/32   1/4 27/64 29/64 41/64   1/2     1
  • Matrix \(\mathbf{T}\) (lower triangular matrix):
Tm
## 9 x 9 sparse Matrix of class "dtCMatrix"
##                                                  
##  [1,] 1.000 .      .    .     .    .    .   .   .
##  [2,] .     1.0000 .    .     .    .    .   .   .
##  [3,] 0.500 0.5000 1.00 .     .    .    .   .   .
##  [4,] .     0.5000 .    1.000 .    .    .   .   .
##  [5,] 0.250 0.5000 0.50 0.500 1.00 .    .   .   .
##  [6,] 0.250 0.7500 0.50 .     .    1.00 .   .   .
##  [7,] 0.250 0.6250 0.50 0.250 0.50 0.50 1.0 .   .
##  [8,] .     .      .    .     .    .    .   1.0 .
##  [9,] 0.125 0.3125 0.25 0.125 0.25 0.25 0.5 0.5 1
  • Matrix \(\mathbf{W}\) (the variance-covariance matrix for Mendelian sampling):
(MASS::fractions(as.matrix(Wm)))
##       [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  
##  [1,]      1      0      0      0      0      0      0      0      0
##  [2,]      0      1      0      0      0      0      0      0      0
##  [3,]      0      0    1/2      0      0      0      0      0      0
##  [4,]      0      0      0    3/4      0      0      0      0      0
##  [5,]      0      0      0      0    1/2      0      0      0      0
##  [6,]      0      0      0      0      0    1/2      0      0      0
##  [7,]      0      0      0      0      0      0  13/32      0      0
##  [8,]      0      0      0      0      0      0      0      1      0
##  [9,]      0      0      0      0      0      0      0      0 55/128
  • Vector of the diagonal for matrix \(\mathbf{W}^{-1}\):
diag(WInv <- Diagonal(x=1/Dmat(pedigree1)))
## [1] 1.000000 1.000000 2.000000 1.333333 2.000000 2.000000 2.461538 1.000000
## [9] 2.327273
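The vector above can be reproduced directly from the three well-known rules for the diagonal of \(\mathbf{W}\) (both parents known, one parent known, no parents known; see Section 4), using the inbreeding coefficients \(F_i = \mathbf{A}_{i,i} - 1\) typed in from the output of \(\mathbf{A}\) above. A stand-alone sketch in base R:

```r
Fcoef <- c(0, 0, 0, 0, 1/8, 1/4, 9/32, 0, 0)  # F_i = A[i,i] - 1
fid   <- c(0, 0, 2, 2, 4, 2, 5, 0, 7)         # 0 = unknown parent
mid   <- c(0, 0, 1, 0, 3, 3, 6, 0, 8)
w <- numeric(9)
for (i in 1:9) {
  f <- fid[i]; m <- mid[i]
  if (f > 0 && m > 0) {
    w[i] <- 0.5 - 0.25 * (Fcoef[f] + Fcoef[m])  # both parents known
  } else if (f > 0 || m > 0) {
    w[i] <- 0.75 - 0.25 * Fcoef[max(f, m)]      # one parent known
  } else {
    w[i] <- 1                                   # both parents unknown
  }
}
round(1 / w, 6)  # the diagonal of W^-1 shown above
```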

2 BUGS language and JAGS implementation

BUGS users do not need to know how to construct the full conditional distributions or choose sampling algorithms. Users have to supply the model, the data, and possibly some initial values, while the BUGS program itself does the rest. The key part, and the difference from other programs, is the model description. The BUGS model language is described in Spiegelhalter et al. (2003, 2007). Here, the two most important operators are described, namely ~ and <-:

  • Operator (~) describes a stochastic relationship, e.g., a ~ dnorm(mu, tau) states that the variable a is distributed according to the normal distribution with mean mu and precision tau (the reciprocal of the variance).
  • Operator (<-) describes a deterministic relationship, e.g., mu <- a + b states that the variable mu is the sum of two components.

Finally, everything that follows the character # is considered a comment and is not evaluated by BUGS or JAGS.

The animal model code proposed here is a direct description of DAG using BUGS language. The code below is similar to the one presented by Waldmann (2009a), but it is more compact and generic.

# Animal model in BUGS language
AnimalModelBUGSVar <- function()
{
  ## Priors - precision parameters
  tau2e ~ dgamma(0.001, 0.001)
  tau2a ~ dgamma(0.001, 0.001)
  sigma2e <- pow(tau2e, -1)
  sigma2a <- pow(tau2a, -1)

  ## Additive genetic values
  for(k in 1:nI) {
    a[id[k]] ~ dnorm(pa[id[k]], Xtau2a[id[k]])
    pa[id[k]] <- 0.5 * (a[fid[k]] + a[mid[k]])
    Xtau2a[id[k]] <- winv[id[k]] * tau2a
  }
  a[nU] <- 0 # NULL (zero) holder

  ## Fixed Effects
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}

Waldmann’s proposal treats founders and non-founders separately, assumes that phenotypic data are available only for non-founders, neglects inbreeding when calculating the coefficients of within-family segregation variance, and is not easy to extend. There are no such limitations in AnimalModelBUGSVar.

The required data for the proposed implementation are symbolically shown below:

tmp <- AnimalModelBUGSData(ped=example, intercept = FALSE)

# Pedigree 'Plate'
list(nI = tmp$nI, id = tmp$id, fid = tmp$fid, mid = tmp$mid,
     winv = tmp$winv, nU = tmp$nU)
## $nI
## [1] 9
## 
## $id
## [1] 1 2 3 4 5 6 7 8 9
## 
## $fid
## [1] 10 10  2  2  4  2  5 10  7
## 
## $mid
## [1] 10 10  1 10  3  3  6 10  8
## 
## $winv
## [1] 1.000000 1.000000 2.000000 1.333333 2.000000 2.000000 2.461538 1.000000
## [9] 2.327273
## 
## $nU
## [1] 10
# Phenotype 'plate'
list(nY = tmp$nY, idy = tmp$idy, y = tmp$y, nB = tmp$nB, x = tmp$x)
## $nY
## [1] 6
## 
## $idy
## [1] 2 3 4 5 6 9
## 
## $y
## [1] 105  98 101 106  93 109
## 
## $nB
## [1] 2
## 
## $x
##   group1 group2
## 1      1      0
## 2      1      0
## 3      0      1
## 4      0      1
## 5      0      1
## 6      1      0
## attr(,"assign")
## [1] 1 1
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
# Variance
sigma2a <- 1/3
sigma2e <- 1

# Precision = 1 / variance
tmp$tau2a <- 1 / sigma2a
tmp$tau2e <- 1 / sigma2e

The lengths of the variables differ, but this can be accommodated using the list structure described in the JAGS manual. Users have to provide the pedigree in the form of three integer identification vectors which constitute the family trios (id - individual, fid - father, and mid - mother). Additionally, a vector winv with the reciprocals of the within-family segregation variance coefficients is required. The length of these vectors is \(nI\times1\), where \(nI\) is the number of all individuals in the pedigree. Identifications need not range between \(1\) and \(nI\), though this is commonly done. Even character names can be used for identifications in JAGS, but several other programs rely on integer coding of pedigree members.

All unknown parents must be represented by the identification number \(nU\), commonly \(nU=nI+1\). This corresponds to an additional dummy node (a[nU]) in the DAG. The addition of a dummy node enables the use of the same code for founders and non-founders and is mandatory because BUGS/JAGS does not allow ‘zero indexing’ (e.g. a[0]) as suggested by Damgaard (2007a). The variables regarding the phenotypes are:

  • the number of phenotypic records (nY),
  • individual identifications (idy), which link phenotypes and pedigree as sketched by the \(\mathbf{Z}_{i,k}\) arc,
  • phenotypic values (y),
  • model predictor(s) (x), which can be either a factor or a covariate, and
  • the number of fixed effects (nB).
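The recoding of unknown parents to the dummy identification \(nU\) can be sketched in a few lines of base R; this is a hypothetical stand-alone helper mirroring what AnimalModelBUGSData produces, not the function itself:

```r
ped <- data.frame(individual = 1:9,
                  father = c(NA, NA, 2, 2, 4, 2, 5, NA, 7),
                  mother = c(NA, NA, 1, NA, 3, 3, 6, NA, 8))
nI  <- nrow(ped)
nU  <- nI + 1  # dummy identification shared by all unknown parents
fid <- ifelse(is.na(ped$father), nU, ped$father)
mid <- ifelse(is.na(ped$mother), nU, ped$mother)
fid  # 10 10 2 2 4 2 5 10 7, as in the pedigree 'plate' above
```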

For JAGS, data (phenotypes, pedigree, and identifications) need not be sorted in chronological or any other order. However, other programs that work with pedigree data often rely on different ways of ordering the data.

Records with missing phenotypes are excluded, except for the pedigree part. However, JAGS has a notable feature regarding missing values. When the phenotype is missing but the other variables are not, such a record can be added to the data (the phenotype must be set to NA), and JAGS will automatically impute a randomly generated value given the specified model. This feature can be used to create simulated data according to an animal model (e.g. Van Vleck (1994)). Given the pedigree, the model description, and a full vector of missing phenotypic values (all set to NA), JAGS will forward-sample from the specified distributions from the top to the bottom of the DAG, which will in the end produce a sample of phenotypic values. For such a simulation, it is recommended to use informative priors or preset values for some parameters to obtain moderate phenotypic values. Finally, if variances are assumed known, their values need to be added to the data (Table 1.1) and their prior specifications must be removed from the model (see AnimalModelBUGSVar).
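This forward-sampling logic can be mimicked in plain R. The sketch below simulates additive genetic values down the example pedigree and phenotypes for the recorded individuals, under the same model assumptions; the group means in b are hypothetical values chosen for illustration:

```r
set.seed(42)
fid <- c(0, 0, 2, 2, 4, 2, 5, 0, 7)  # 0 = unknown parent
mid <- c(0, 0, 1, 0, 3, 3, 6, 0, 8)
w   <- c(1, 1, 1/2, 3/4, 1/2, 1/2, 13/32, 1, 55/128)  # diag(W)
sigma2a <- 1/3; sigma2e <- 1
b <- c(100, 102)  # hypothetical group means
a <- numeric(9)
for (i in 1:9) {  # parents precede offspring, so a[f] and a[m] are ready
  pa   <- 0.5 * ((if (fid[i] > 0) a[fid[i]] else 0) +
                 (if (mid[i] > 0) a[mid[i]] else 0))
  a[i] <- rnorm(1, mean = pa, sd = sqrt(w[i] * sigma2a))
}
idy   <- c(2, 3, 4, 5, 6, 9)  # individuals with phenotypic records
group <- c(1, 1, 2, 2, 2, 1)
y     <- rnorm(6, mean = b[group] + a[idy], sd = sqrt(sigma2e))
```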

Because JAGS does not allow the specification of improper distributions, the developers suggested the use of the conditionally conjugate gamma distribution for the precision \(\tau^{2}\) (tau2) with small values for both parameters, \(\tau^{2}\sim \mbox{Gamma}\left(0.001,0.001\right)\). This is equivalent to \(\sigma^{2}\sim \mbox{Inverse-Gamma}\left(0.001,0.001\right)\). Waldmann (2009a) used uniform priors for the standard deviations (\(\sigma_{a}\) and \(\sigma_{e}\)) following the suggestion of Gelman (2006), who demonstrated that this prior is closer to non-informative for models with a small number of groups (individuals) or a small between-group variance. The latter scenario should be of concern for quantitative geneticists analysing traits with small genetic variability. The variance priors in AnimalModelBUGSVar use a conditionally conjugate gamma distribution so that BUGS/JAGS can use Gibbs sampling instead of other, more time-consuming algorithms. However, this does not mean that the gamma prior for precisions should be preferred to the uniform prior for standard deviations. Users are encouraged to perform a sensitivity analysis with different priors and decide whether the priors affect their conclusions. Following the approach described here, users familiar with JAGS can easily conduct such analyses by changing a line or two in AnimalModelBUGSVar.
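The equivalence of the two parameterizations follows from a change of variables; a one-line derivation with generic shape \(\alpha\) and rate \(\beta\):

\[\begin{align*} \tau^{2} \sim \mbox{Gamma}\left(\alpha,\beta\right), \quad \sigma^{2}=\frac{1}{\tau^{2}} \;\Rightarrow\; p\left(\sigma^{2}\right)=p_{\tau^{2}}\!\left(\frac{1}{\sigma^{2}}\right)\left|\frac{\mathrm{d}\tau^{2}}{\mathrm{d}\sigma^{2}}\right| =\frac{\beta^{\alpha}}{\Gamma\left(\alpha\right)}\left(\sigma^{2}\right)^{-\alpha-1}e^{-\beta/\sigma^{2}}, \end{align*}\]

which is the kernel of an \(\mbox{Inverse-Gamma}\left(\alpha,\beta\right)\) density.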

3 Animal models with and without accounting for inbreeding

3.1 Example 2: Animal model ignoring inbreeding

# General JAGS definitions
#-----------------------------------------------------------------------
# Initialization
#-----------------------------------------------------------------------
initfunction <- function(chain) {
  nI <- 9
  nB <- 2
  nc <- 3
  stopifnot(chain %in% 1:nc)
  a <- b <- list()
  for(i in 1:nc){
    a[[i]] <- c(rnorm(nI,0,2),NA)
    b[[i]] <- rnorm(nB,100,10)
  }
  .RNG.name <- rep("base::Super-Duper", 3)[chain]
  .RNG.seed = seq(1,3,1)[chain]
  return(list(a = a[[chain]], b = b[[chain]], 
              .RNG.seed = .RNG.seed, .RNG.name = .RNG.name))
}

## runJags
nChains <- 3
nBurninSteps <- 30000
nThinSteps <- 15
nUseSteps <- 60000

# Number of fitted models for calculating the computing cost
nt <- 50
#-----------------------------------------------------------------------
## For repeatability model
#-----------------------------------------------------------------------    
## Hyperparameters - variance of additive genetic values and variance of
## residuals
sigma2a <- 1/3
sigma2e <- 1

# Heritability
(h2 <- sigma2a / (sigma2a + sigma2e)) ## 0.25
## [1] 0.25
#=======================================================================
# Animal Model ignoring inbreeding
#=======================================================================
## Prepare data - ignoring inbreeding
tmp <- AnimalModelBUGSData(ped=example,  inbreeding = FALSE,
                            intercept = FALSE)

## Set precisions
tmp$tau2a <- 1 / sigma2a
tmp$tau2e <- 1 / sigma2e
tmp
## $nI
## [1] 9
## 
## $nY
## [1] 6
## 
## $y
## [1] 105  98 101 106  93 109
## 
## $nU
## [1] 10
## 
## $id
## [1] 1 2 3 4 5 6 7 8 9
## 
## $fid
## [1] 10 10  2  2  4  2  5 10  7
## 
## $mid
## [1] 10 10  1 10  3  3  6 10  8
## 
## $winv
## [1] 1.000000 1.000000 2.000000 1.333333 2.000000 2.000000 2.000000 1.000000
## [9] 2.000000
## 
## $idy
## [1] 2 3 4 5 6 9
## 
## $x
##   group1 group2
## 1      1      0
## 2      1      0
## 3      0      1
## 4      0      1
## 5      0      1
## 6      1      0
## attr(,"assign")
## [1] 1 1
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
## 
## 
## $nB
## [1] 2
## 
## $tau2a
## [1] 3
## 
## $tau2e
## [1] 1

3.1.1 JAGS model

AnimalModelBUGS <- function()
{
  ## Additive genetic values
  for(k in 1:nI) {
    a[id[k]] ~ dnorm(pa[id[k]], Xtau2a[id[k]])
    pa[id[k]] <- 0.5 * (a[fid[k]] + a[mid[k]])
    Xtau2a[id[k]] <- winv[id[k]] * tau2a
  }
  a[nU] <- 0 # NULL (zero) holder

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}

Some comments:

  • x[i, ] is the \(i\)th row of matrix \(\mathbf{X}\);
  • b[] is the vector of fixed effects;
  • a[id[k]] is the additive genetic value (\(a_{i} \sim \mbox{N}\left(0, \sigma_{a}^{2}\right)\));
  • pa[id[k]] is the expected value based on the parents’ additive genetic values (\(1/2\left(a_{father(i)}+ a_{mother(i)}\right)\));
  • winv[id[k]] is the reciprocal of the within-family segregation variance coefficient;
  • tau2a is the precision parameter, the reciprocal of the genetic variance (\(\sigma^2_{a}=1/\tau^2_a\));
  • tau2e is the precision parameter for the experimental error.
## Writing the JAGS model
filename <- "model.txt"
writeJags(model = AnimalModelBUGS, filename = filename)
#-----------------------------------------------------------------------
# Jags
#-----------------------------------------------------------------------
## runJags
jags_params <- c("a", "b")
runJagsOut <- run.jags(method = "parallel",
                       model = filename,
                       monitor = jags_params,
                       data = tmp,
                       n.chains = nChains,
                       burnin = nBurninSteps,
                       sample = ceiling(nUseSteps/nChains),
                       thin = nThinSteps,
                       summarise = FALSE,
                       plots = FALSE,
                       inits = initfunction)
## Calling 3 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.0 on Tue Mar 16 17:01:16 2021
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 6
##    Unobserved stochastic nodes: 11
##    Total graph size: 112
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 30000
## -------------------------------------------------| 30000
## ************************************************** 100%
## . . . Updating 300000
## -------------------------------------------------| 300000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## . 
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete.  Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
# coda samples - MCMC
coda_samples <- as.mcmc.list(runJagsOut)

# parameter estimates
estimatesNoInbreeding <- MCMCsummary(coda_samples, round = 4)
estimatesNoInbreeding

3.2 Example 2: Animal model accounting for inbreeding

## Prepare data
tmp <- AnimalModelBUGSData(ped=example, intercept = FALSE)

## Set precisions
tmp$tau2a <- 1 / sigma2a
tmp$tau2e <- 1 / sigma2e

tmp
## $nI
## [1] 9
## 
## $nY
## [1] 6
## 
## $y
## [1] 105  98 101 106  93 109
## 
## $nU
## [1] 10
## 
## $id
## [1] 1 2 3 4 5 6 7 8 9
## 
## $fid
## [1] 10 10  2  2  4  2  5 10  7
## 
## $mid
## [1] 10 10  1 10  3  3  6 10  8
## 
## $winv
## [1] 1.000000 1.000000 2.000000 1.333333 2.000000 2.000000 2.461538 1.000000
## [9] 2.327273
## 
## $idy
## [1] 2 3 4 5 6 9
## 
## $x
##   group1 group2
## 1      1      0
## 2      1      0
## 3      0      1
## 4      0      1
## 5      0      1
## 6      1      0
## attr(,"assign")
## [1] 1 1
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
## 
## 
## $nB
## [1] 2
## 
## $tau2a
## [1] 3
## 
## $tau2e
## [1] 1
#=======================================================================
##  Animal model known variance
#=======================================================================
AnimalModelBUGS <- function()
{
  ## Additive genetic values
  for(k in 1:nI) {
    a[id[k]] ~ dnorm(pa[id[k]], Xtau2a[id[k]])
    pa[id[k]] <- 0.5 * (a[fid[k]] + a[mid[k]])
    Xtau2a[id[k]] <- winv[id[k]] * tau2a
  }
  a[nU] <- 0 # NULL (zero) holder

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}

3.2.1 JAGS model

## Structuring the dataset
tmp <- AnimalModelBUGSData(ped=example, intercept = FALSE)

## Set precisions
tmp$tau2a <- 1 / sigma2a
tmp$tau2e <- 1 / sigma2e

print(tmp)

## Writing the JAGS model
filename <- "model.txt"
writeJags(model = AnimalModelBUGS, filename = filename)
#=======================================================================
# Jags
#=======================================================================
jags_params <- c("a", "b")
runJagsOut <- run.jags(method = "parallel",
                       model = filename,
                       monitor = jags_params,
                       data = tmp,
                       n.chains = nChains,
                       burnin = nBurninSteps,
                       sample = ceiling(nUseSteps/nChains),
                       thin = nThinSteps,
                       summarise = FALSE,
                       plots = FALSE,
                        inits = initfunction)
# coda samples - MCMC
coda_samples <- as.mcmc.list(runJagsOut)

3.2.2 Parameter estimates

# Estimates
estimatesInbreeding <- MCMCsummary(coda_samples, round = 4)
estimatesInbreeding

3.3 Comparing both animal models (with and without inbreeding)

Waldmann (2009a), following Lin (1999), did not account for inbreeding in the calculation of the within-family segregation (Mendelian) variance coefficients \(\left(\mathbf{W}\right)\); they both used the value 0.5 for all non-base individuals. The effect of neglecting inbreeding on the inference of \(\mathbf{b}\) and \(\mathbf{a}\) for the animal model and the example data set (Table 1.1) is not large, but it is observable (Table 3.1 and Figure 3.1). Some general-purpose animal model programs do not account for inbreeding by default: the amount of inbreeding is often rather low, and the computation of inbreeding coefficients might be difficult for large populations. However, this is not optimal for every scenario.

#-----------------------------------------------------------------------

#-----------------------------------------------------------------------
data_comp <- data.frame(NoInb = estimatesNoInbreeding[c(1:9), 1],
                        Inb = estimatesInbreeding[c(1:9), 1],
                        Animal = gl(n = 9,  k = 1,
                                    labels = paste0("Animal ", 1:9)))
data_comp %>%
  ggplot(aes(y = NoInb,  x = Inb,  label = Animal)) +
  geom_point(shape = 1,  alpha = 0.9) +
  geom_abline(intercept = 0,  slope = 1) +
  geom_text_repel(
    nudge_y      = 0.05,
    direction    = "x",
    vjust        = 0,
    segment.size = 0.2) +
  ylab("Estimates considering no inbreeding") +
  xlab("Estimates considering inbreeding") +
  theme_bw(base_size = 12)
Figure 3.1: Parameter estimates considering no inbreeding (y-axis) against estimates considering inbreeding (x-axis)

data_diff <- data.frame(
  Parameter = rownames(estimatesInbreeding)[c(1:9, 11:12)],
  Difference = estimatesNoInbreeding[c(1:9,  11, 12), 1] -
               estimatesInbreeding[c(1:9,  11, 12), 1],
  Ratio = estimatesNoInbreeding[c(1:9,  11, 12), 1]/
          estimatesInbreeding[c(1:9,  11, 12), 1])
Table 3.1: Difference and ratio between parameter estimates

Parameter  Difference   Ratio
a[1]           0.0026  0.9963
a[2]           0.0091  0.9821
a[3]           0.0085  0.9935
a[4]           0.0029  1.0050
a[5]           0.0031  1.0046
a[6]           0.0058  0.9965
a[7]           0.0596  0.7445
a[8]          -0.0106  0.9836
a[9]           0.1048  1.1374
b[1]          -0.0408  0.9996
b[2]          -0.0040  1.0000

Mehrabani-Yeganeh, Gipson, and Schaeffer (2001) simulated populations where animals were selected based on additive genetic values computed with or without accounting for inbreeding. The simulation ignored potential variation due to dominance and, therefore, inbreeding depression. They concluded that ignoring inbreeding did not affect the animals’ ranking and, consequently, the response to selection, even with high inbreeding levels. However, ignoring inbreeding results in underestimating the additive genetic variance in the base population due to overestimated family segregation coefficients in non-base individuals. Additionally, when inbreeding is accounted for, the animal model, coupled with MCMC methods, can infer the change of additive genetic variance over generations due to the Bulmer effect (Sorensen, Fernando, and Gianola 2001). The applicability of this method to evolutionary studies still needs to be assessed. Thus, it is better to account for inbreeding whenever possible.

4 Cholesky decomposition of relationship matrix

Matrix \(\mathbf{A}\) can also be decomposed via the Cholesky method, \(\mathbf{A}=\mathbf{T}\mathbf{W}^{\frac{1}{2}}\mathbf{W}^{\frac{1}{2}}\mathbf{T}^{T}=\mathbf{L}\mathbf{L}^{T}\), which is useful for efficient numerical solutions as well as for Monte Carlo simulations. The diagonal values of \(\mathbf{W}\) are computed according to the following three formulas (4.1), which correspond to the scenarios (from top to bottom): both parents known, one parent known (say, the mother), and both parents unknown:

\[\begin{align} \mathbf{W}_{i,i} & = & 1-\frac{1}{4}\left(1+F_{f\left(i\right)}\right)-\frac{1}{4}\left(1+F_{m\left(i\right)}\right)=\frac{1}{2}-\frac{1}{4}\left(F_{f\left(i\right)}+F_{m\left(i\right)}\right), \tag{4.1} \\ \mathbf{W}_{i,i} & = & 1-\frac{1}{4}\left(1+F_{m\left(i\right)}\right)=\frac{3}{4}-\frac{1}{4}F_{m\left(i\right)},\nonumber \\ \mathbf{W}_{i,i} & = & 1.\nonumber \end{align}\] In the case of ignoring inbreeding or no inbreeding, the formulas give values: \(\frac{1}{2}\), \(\frac{3}{4}\), and \(1\).
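The decomposition can be verified numerically for the example pedigree. The sketch below rebuilds \(\mathbf{A}\), \(\mathbf{T}\), and \(\mathbf{W}\) from scratch in base R and checks \(\mathbf{A}=\mathbf{L}\mathbf{L}^{T}\) with \(\mathbf{L}=\mathbf{T}\mathbf{W}^{\frac{1}{2}}\) (stand-alone code, independent of the objects defined earlier):

```r
fid <- c(0, 0, 2, 2, 4, 2, 5, 0, 7)  # 0 = unknown parent
mid <- c(0, 0, 1, 0, 3, 3, 6, 0, 8)
nI  <- 9
## A by the tabular method
A <- matrix(0, nI, nI)
for (i in 1:nI) {
  f <- fid[i]; m <- mid[i]
  A[i, i] <- 1 + 0.5 * (if (f > 0 && m > 0) A[f, m] else 0)
  for (j in seq_len(i - 1)) {
    A[i, j] <- A[j, i] <- 0.5 * ((if (f > 0) A[j, f] else 0) +
                                 (if (m > 0) A[j, m] else 0))
  }
}
## T: gene-flow matrix, row i = e_i + (father's row + mother's row) / 2
Tm <- diag(nI)
for (i in 1:nI) {
  if (fid[i] > 0) Tm[i, ] <- Tm[i, ] + 0.5 * Tm[fid[i], ]
  if (mid[i] > 0) Tm[i, ] <- Tm[i, ] + 0.5 * Tm[mid[i], ]
}
## W from formulas (4.1), with F = diag(A) - 1
Fcoef <- diag(A) - 1
w <- numeric(nI)
for (i in 1:nI) {
  f <- fid[i]; m <- mid[i]
  if (f > 0 && m > 0) {
    w[i] <- 0.5 - 0.25 * (Fcoef[f] + Fcoef[m])
  } else if (f > 0 || m > 0) {
    w[i] <- 0.75 - 0.25 * Fcoef[max(f, m)]
  } else {
    w[i] <- 1
  }
}
L <- Tm %*% diag(sqrt(w))
max(abs(L %*% t(L) - A))  # numerically zero: A = L L'
```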

Waldmann (2009a) reported that his implementation is faster and results in Markov chains with better mixing (smaller autocorrelations) in comparison to Damgaard (2007a). The code proposed by Waldmann (2009a) first computes the parent average and precision (1 / variance) to sample from the normal distribution with these two parameters (see Variant 1).

## Transformed additive genetic values - variant 1
AnimalModelBUGSCholVar <- function()
{
  ## Variance priors
  tau2e ~ dgamma(0.001, 0.001)
  tau2a ~ dgamma(0.001, 0.001)
  sigma2e <- 1 / tau2e
  sigma2a <- 1 / tau2a

  ## Additive genetic values
  for(k in 1:nI) {
    u[k] ~ dnorm(0, tau2a)
    a[id[k]] <- u[k] * wsqr[id[k]] + 0.5 * (a[fid[k]] + a[mid[k]])
  }
  a[nU] <- 0 # NULL (zero) holder

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}

Damgaard (2007a), on the other hand, first sampled from the standard normal distribution, scaled by the square root of the within-family segregation variance coefficient, added the parent average, and at the end scaled by the additive genetic standard deviation. The approach of Damgaard (2007a) is essentially the transformed (independent) sampler for additive genetic values (e.g. Waagepetersen, Ibáñez-Escriche, and Sorensen 2008; Shariati and Sorensen 2009a). To highlight the differences between the proposed implementations, the code by Damgaard (2007a) was rewritten in two variants (see Variant 2) and compared for speed and mixing with the code presented here (see Variant 1). The original implementations of Damgaard (2007a) and Waldmann (2009a) were not tested because they are not generic.

## Transformed additive genetic values - variant 2
AnimalModelBUGSChol2 <- function()
{
  ## Variance priors
  tau2e ~ dgamma(0.001, 0.001)
  tau2a ~ dgamma(0.001, 0.001)
  sigma2e <- 1 / tau2e
  sigma2a <- 1 / tau2a
  sigmaa <- pow(tau2a, -0.5) # additive genetic standard deviation

  ## Additive genetic values
  for(k in 1:nI) {
    u[k] ~ dnorm(0, 1)
    a[id[k]] <- u[k] * wsqr[id[k]] * sigmaa + 0.5 * (a[fid[k]] + a[mid[k]])
  }
  a[nU] <- 0 # NULL (zero) holder

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}

4.1 Example 3: Cholesky decomposition of relationship matrix “in prior”

# Database
(tmp <- AnimalModelBUGSData(ped=example, Chol=TRUE, intercept = FALSE))
## $nI
## [1] 9
## 
## $nY
## [1] 6
## 
## $y
## [1] 105  98 101 106  93 109
## 
## $nU
## [1] 10
## 
## $id
## [1] 1 2 3 4 5 6 7 8 9
## 
## $fid
## [1] 10 10  2  2  4  2  5 10  7
## 
## $mid
## [1] 10 10  1 10  3  3  6 10  8
## 
## $idy
## [1] 2 3 4 5 6 9
## 
## $wsqr
## [1] 1.0000000 1.0000000 0.7071068 0.8660254 0.7071068 0.7071068 0.6373774
## [8] 1.0000000 0.6555055
## 
## $x
##   group1 group2
## 1      1      0
## 2      1      0
## 3      0      1
## 4      0      1
## 5      0      1
## 6      1      0
## attr(,"assign")
## [1] 1 1
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
## 
## 
## $nB
## [1] 2
## Set precisions
tmp$tau2a <- 1 / sigma2a
tmp$tau2e <- 1 / sigma2e
# model
AnimalModelBUGSChol <- function()
{
  ## Additive genetic values
  for(k in 1:nI) {
    u[k] ~ dnorm(0, tau2a)
    a[id[k]] <- u[k] * wsqr[id[k]] + 0.5 * (a[fid[k]] + a[mid[k]])
  }
  a[nU] <- 0 # NULL (zero) holder

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}
writeJags(model = AnimalModelBUGSChol, filename = "model.txt")
initfunctionChol <- function(chain) {
  nI <- 9
  nB <- 2
  nc <- 3
  stopifnot(chain %in% 1:nc)
  u <- b <- list()
  for(i in 1:nc){
    u[[i]] <- rnorm(nI,0,2)
    b[[i]] <- rnorm(nB,100,10)
  }
  .RNG.name <- rep("base::Super-Duper", 3)[chain]
  .RNG.seed <- seq(1, 3, 1)[chain]
  return(list(u = u[[chain]], b = b[[chain]], 
              .RNG.seed = .RNG.seed, .RNG.name = .RNG.name))
}
jags_params <- c("a", "b")

start_time <- Sys.time()
runJagsOut <- run.jags(method = "parallel",
                       model = filename,
                       monitor = jags_params,
                       data = tmp,
                       n.chains = nChains,
                       burnin = nBurninSteps,
                       sample = ceiling(nUseSteps/nChains),
                       thin = nThinSteps,
                       summarise = FALSE,
                       plots = FALSE,
                       inits = initfunctionChol)
## Calling 3 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.0 on Tue Mar 16 17:01:21 2021
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 6
##    Unobserved stochastic nodes: 11
##    Total graph size: 127
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 30000
## -------------------------------------------------| 30000
## ************************************************** 100%
## . . . Updating 300000
## -------------------------------------------------| 300000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## . 
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete.  Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
end_time <- Sys.time()
timeChol <- difftime(end_time, start_time, units='secs')

# coda samples - MCMC
coda_samples <- as.mcmc.list(runJagsOut)

# parameter estimates
estimatesCholInbreeding <- MCMCsummary(coda_samples, round = 4)
estimatesCholInbreeding

4.2 Comparing models accounting for inbreeding with and without the Cholesky decomposition

#-----------------------------------------------------------------------
# model 1 - Inb: estimates using the standard inverse matrix
# model 2 - CholInb: estimates using Cholesky decomposition
#-----------------------------------------------------------------------
data_comp <- data.frame(CholInb = estimatesCholInbreeding[c(1:9), 1],
                        Inb = estimatesInbreeding[c(1:9), 1],
                        Animal = gl(n = 9,  k = 1,
                                    labels = paste0("Animal ", 1:9)))

data_comp %>%
  ggplot(aes(y = CholInb,  x = Inb,  label = Animal)) +
  geom_point(shape = 1,  alpha = 0.9) +
  geom_abline(intercept = 0,  slope = 1) +
  geom_text_repel(
    nudge_y      = 0.05,
    direction    = "x",
    vjust        = 0,
    segment.size = 0.2) +
  ylab("Estimates considering the Cholesky decomposition") +
  xlab("Estimates considering standard inverse matrix") +
  theme_bw(base_size = 12)

data_diff <- data.frame(
  Parameter = rownames(estimatesInbreeding)[c(1:9, 11:12)],
  Difference = estimatesCholInbreeding[c(1:9,  11, 12), 1] -
    estimatesInbreeding[c(1:9,  11, 12), 1],
  Ratio = estimatesCholInbreeding[c(1:9,  11, 12), 1]/
    estimatesInbreeding[c(1:9,  11, 12), 1])

data_diff %>%
  kbl(
    caption = "Difference and ratio between parameter estimates",
    digits = 4,
  ) %>%
  kable_paper("hover", full_width = F) %>%
  column_spec(2, color = "white",
              background = spec_color(data_diff[,2], end = 0.7),
              popover = paste("am:", data_diff[,2])) %>%
  column_spec(3, color = "white", 
              background = spec_color(data_diff[,2], end = 0.7),
              popover = paste("am:", data_diff[,3]))
Table 4.1: Difference and ratio between parameter estimates
Parameter Difference Ratio
a[1] -0.0011 1.0016
a[2] 0.0007 0.9986
a[3] 0.0008 0.9994
a[4] 0.0009 1.0015
a[5] 0.0008 1.0012
a[6] 0.0027 0.9984
a[7] 0.0014 0.9940
a[8] 0.0023 1.0036
a[9] 0.0008 1.0010
b[1] -0.0008 1.0000
b[2] -0.0015 1.0000

5 Reordering pedigree information

Generally, the internal ordering of the pedigree is of little importance for end-users. However, it is essential to mention that the function AnimalModelBUGSData assumes the pedigree is ordered so that parents come before offspring (type ?animalModels::AnimalModelBUGSData in R for more details). The logical argument reorder of the AnimalModelBUGSData function, when set to reorder=TRUE, causes a reordering of the pedigree (via orderPed() from the package MasterBayes) for the internal calculations, while the data is returned in the original order. If the pedigree is not ordered, inbreeding cannot be set to TRUE, because the \(\boldsymbol{A}\) matrix would be computed incorrectly and bias the analysis.
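The ordering requirement can be checked with a few lines of base R. The helper below and its toy pedigree are hypothetical (not part of the package); they only illustrate the rule that every parent must appear before its offspring:

```r
# Hypothetical helper: checks that every parent appears before its
# offspring in a pedigree data frame with columns id, fid, mid
# (NA = unknown parent).
isOrdered <- function(ped) {
  pos <- seq_len(nrow(ped))
  names(pos) <- ped$id
  before <- function(parent, child) {
    is.na(parent) | pos[as.character(parent)] < pos[as.character(child)]
  }
  all(before(ped$fid, ped$id) & before(ped$mid, ped$id))
}
ped <- data.frame(id = 1:3, fid = c(NA, NA, 1), mid = c(NA, NA, 2))
isOrdered(ped)               # TRUE: parents precede offspring
isOrdered(ped[c(3, 1, 2), ]) # FALSE: offspring 3 is listed first
```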

5.1 Example 4: Reordering pedigree information to evaluate inbreeding correctly

Here, we deliberately induce a wrong order in the pedigree and recover a valid ordering with the function orderPed().

## Prepare data
ord <- c(4, 6, 1, 9, 2, 5, 7, 8, 3)
(tmp <- AnimalModelBUGSData(ped=example[ord, ], Chol=TRUE,
                            reorder=TRUE, intercept = FALSE))
## $nI
## [1] 9
## 
## $nY
## [1] 6
## 
## $y
## [1] 101  93 109 105 106  98
## 
## $nU
## [1] 10
## 
## $id
## [1] 4 6 1 9 2 5 7 8 3
## 
## $fid
## [1]  2  2 10  7 10  4  5 10  2
## 
## $mid
## [1] 10  3 10  8 10  3  6 10  1
## 
## $idy
## [1] 4 6 9 2 5 3
## 
## $wsqr
## [1] 1.0000000 1.0000000 0.7071068 0.8660254 0.7071068 0.7071068 0.6373774
## [8] 1.0000000 0.6555055
## 
## $x
##   group1 group2
## 1      0      1
## 2      0      1
## 3      1      0
## 4      1      0
## 5      0      1
## 6      1      0
## attr(,"assign")
## [1] 1 1
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
## 
## 
## $nB
## [1] 2
## Set precisions
tmp$tau2a <- 1 / sigma2a
tmp$tau2e <- 1 / sigma2e
AnimalModelBUGSChol <- function()
{
  ## Additive genetic values
  for(k in 1:nI) {
    u[k] ~ dnorm(0, tau2a)
    a[id[k]] <- u[k] * wsqr[id[k]] + 0.5 * (a[fid[k]] + a[mid[k]])
  }
  a[nU] <- 0 # NULL (zero) holder

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}
writeJags(model = AnimalModelBUGSChol, filename = "model.txt")
#-----------------------------------------------------------------------
# Model
#-----------------------------------------------------------------------
jags_params <- c("a", "b")
runJagsOut <- run.jags(method = "parallel",
                       model = filename,
                       monitor = jags_params,
                       data = tmp,
                       n.chains = nChains,
                       burnin = nBurninSteps,
                       sample = ceiling(nUseSteps/nChains),
                       thin = nThinSteps,
                       summarise = FALSE,
                       plots = FALSE,
                       inits = initfunctionChol)
## Calling 3 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.0 on Tue Mar 16 17:01:23 2021
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 6
##    Unobserved stochastic nodes: 11
##    Total graph size: 127
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 30000
## -------------------------------------------------| 30000
## ************************************************** 100%
## . . . Updating 300000
## -------------------------------------------------| 300000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## . 
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete.  Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
# coda samples - MCMC
coda_samples <- as.mcmc.list(runJagsOut)

# parameter estimates
estimatesCholInbreedingOrd <- MCMCsummary(coda_samples, round = 4)
estimatesCholInbreedingOrd

5.2 Comparing animal models using Cholesky decomposition

  • estimatesCholInbreeding corresponds to the original pedigree order, while estimatesCholInbreedingOrd contains estimates from the data where the pedigree was re-ordered.
data_comp <-
  data.frame(CholInb = estimatesCholInbreeding[c(1:9), 1],
             CholInbOrd = estimatesCholInbreedingOrd[c(1:9), 1],
             Animal = gl(n = 9,  k = 1,
                         labels = paste0("Animal ", 1:9)))

data_comp %>%
  ggplot(aes(y = CholInbOrd,  x = CholInb,  label = Animal)) +
  geom_point(shape = 1,  alpha = 0.9) +
  geom_abline(intercept = 0,  slope = 1) +
  geom_text_repel(
    nudge_y      = 0.05,
    direction    = "x",
    vjust        = 0,
    segment.size = 0.2) +
  ylab("Estimates considering the Cholesky decomposition with reorder") +
  xlab("Estimates considering the Cholesky decomposition") +
  theme_bw(base_size = 12)

data_diff <- data.frame(
  Parameter = rownames(estimatesCholInbreeding)[c(1:9, 11:12)],
  Difference = estimatesCholInbreedingOrd[c(1:9,  11, 12), 1] -
    estimatesCholInbreeding[c(1:9,  11, 12), 1],
  Ratio = estimatesCholInbreedingOrd[c(1:9,  11, 12), 1]/
    estimatesCholInbreeding[c(1:9,  11, 12), 1])

data_diff %>%
  kbl(
    caption = "Difference and ratio between parameter estimates",
    digits = 4,
  ) %>%
  kable_paper("hover", full_width = F) %>%
  column_spec(2, color = "white",
              background = spec_color(data_diff[,2], end = 0.7),
              popover = paste("am:", data_diff[,2])) %>%
  column_spec(3, color = "white", 
              background = spec_color(data_diff[,2], end = 0.7),
              popover = paste("am:", data_diff[,3]))
Table 5.1: Difference and ratio between parameter estimates
Parameter Difference Ratio
a[1] 0.0042 0.9940
a[2] 0.0008 0.9984
a[3] -0.0011 1.0008
a[4] 0.0012 1.0020
a[5] 0.0002 1.0003
a[6] 0.0001 0.9999
a[7] 0.0002 0.9991
a[8] 0.0000 1.0000
a[9] 0.0009 1.0012
b[1] -0.0002 1.0000
b[2] -0.0005 1.0000
  • The argument reorder recovers a valid ordering of the pedigree, and, consequently, the parameter estimates are essentially the same.

6 Additional parameterizations

Additionally, two other parameterizations of the animal model are of interest; both incorporate the NRM into the design matrix \(\mathbf{Z}\) to remove dependence between additive genetic values, i.e., \(\mathbf{u}|\sigma_{a}^{2}\sim N\left(\mathbf{0},\mathbf{I}\sigma_{a}^{2}\right)\). One approach is to use the left Cholesky factor (Quaas, Anderson, and Gilmour 1984a), \(\mathbf{Z}^{*}=\mathbf{Z}\mathbf{L}=\mathbf{Z}\mathbf{T}\mathbf{W}^{\frac{1}{2}}\), where the original \(\mathbf{a}\) can be computed using \(\mathbf{a}=\mathbf{T}\mathbf{W}^{\frac{1}{2}}\mathbf{u}\), or more efficiently by solving \(\mathbf{T}^{-1}\mathbf{a}=\mathbf{W}^{\frac{1}{2}}\mathbf{u}\), due to the sparseness of \(\mathbf{T}^{-1}\). In a similar manner, Waldmann et al. (2008b) used a singular value decomposition (SVD) of the NRM, i.e., \(\mathbf{A}=\mathbf{U}\mathbf{S}\mathbf{U}^{T}\), where \(\mathbf{U}\) is a dense orthonormal matrix \(\left(\mathbf{U}\mathbf{U}^{T}=\mathbf{I}\right)\) and \(\mathbf{S}\) is a diagonal matrix. The parameterization is then \(\mathbf{Z}^{*}=\mathbf{Z}\mathbf{U}\mathbf{S}^{\frac{1}{2}}\) with \(\mathbf{a}=\mathbf{U}\mathbf{S}^{\frac{1}{2}}\mathbf{u}\). A JAGS implementation of this parameterization can be seen below:

AnimalModelBUGSZStarVar <- function()
{
  ## Variance priors
  tau2e ~ dgamma(0.001, 0.001)
  tau2a ~ dgamma(0.001, 0.001)
  sigma2e <- 1 / tau2e
  sigma2a <- 1 / tau2a

  ## Transformed additive genetic values
  for(k in 1:nI) { u[k] ~ dnorm(0, tau2a) }

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <-  inprod(x[i, ], b[]) + inprod(ZStar[i, ], u[])
  }
}

This parameterization is simple because the additive genetic values become independent and are fitted with a homogeneous variance (precision), while the pedigree information is incorporated into the design matrix \(\mathbf{Z}^{*}\). However, this means that the whole dense matrix \(\mathbf{Z}^{*}\) of order \(nY\times nI\) must be passed to JAGS. Initial tests showed that the CPU time is so long that these two BUGS language approaches are applicable only to pedigrees with a handful of members. A similar conclusion also holds for the augmented parameterization of the mixed model by Harville (1986b). Therefore, these approaches were not tested for speed and mixing.
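Although too slow for large pedigrees, the algebra behind both \(\mathbf{Z}^{*}\) constructions is easy to verify numerically. The sketch below uses a hypothetical \(3\times 3\) NRM (two unrelated founders and their non-inbred offspring) and base R only:

```r
# Toy NRM: two unrelated founders (1, 2) and their non-inbred offspring (3).
A <- matrix(c(1,   0,   0.5,
              0,   1,   0.5,
              0.5, 0.5, 1.0), nrow = 3, byrow = TRUE)
Z <- diag(3)                           # one phenotype record per animal

## Cholesky route: ZStar = Z L, with L the lower factor of A
L <- t(chol(A))
ZStarChol <- Z %*% L

## SVD route: ZStar = Z U S^{1/2}
s <- svd(A)
ZStarSVD <- Z %*% s$u %*% diag(sqrt(s$d))

## Both give Var(ZStar u) = ZStar t(ZStar) sigma2a = Z A t(Z) sigma2a
stopifnot(all.equal(ZStarChol %*% t(ZStarChol), Z %*% A %*% t(Z)))
stopifnot(all.equal(ZStarSVD  %*% t(ZStarSVD),  Z %*% A %*% t(Z)))
```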

6.1 Example 5: Animal model using \(\mathbf{Z}^{*}\) via Cholesky

#=======================================================================
### --- Animal model - ZStar via Cholesky ---
#=======================================================================
## Prepare data
(tmp <- AnimalModelBUGSData(ped=example, ZChol=TRUE, intercept = FALSE))
## $nI
## [1] 6
## 
## $nY
## [1] 6
## 
## $y
## [1] 105  98 101 106  93 109
## 
## $ZStar
##        2         3         4         5         6         9
## 1 1.0000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## 2 0.5000 0.8660254 0.0000000 0.0000000 0.0000000 0.0000000
## 3 0.5000 0.0000000 0.8660254 0.0000000 0.0000000 0.0000000
## 4 0.5000 0.4330127 0.4330127 0.7071068 0.0000000 0.0000000
## 5 0.7500 0.4330127 0.0000000 0.0000000 0.7071068 0.0000000
## 6 0.3125 0.2165064 0.1082532 0.1767767 0.1767767 0.8838835
## 
## $id
## [1] 2 3 4 5 6 9
## 
## $fact
## 6 x 6 sparse Matrix of class "dtCMatrix"
##   2         3         4         5             6         9
## 2 1 0.5000000 0.5000000 0.5000000  7.500000e-01 0.3125000
## 3 . 0.8660254 0.0000000 0.4330127  4.330127e-01 0.2165064
## 4 . .         0.8660254 0.4330127  0.000000e+00 0.1082532
## 5 . .         .         0.7071068 -3.925231e-17 0.1767767
## 6 . .         .         .          7.071068e-01 0.1767767
## 9 . .         .         .          .            0.8838835
## 
## $x
##   group1 group2
## 1      1      0
## 2      1      0
## 3      0      1
## 4      0      1
## 5      0      1
## 6      1      0
## attr(,"assign")
## [1] 1 1
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
## 
## 
## $nB
## [1] 2
## Set precisions
tmp$tau2a <- 1 / sigma2a
tmp$tau2e <- 1 / sigma2e

## Store the right Cholesky factor and id for later use, then remove
## them from the data list - this is mandatory!
U <- tmp$fact
tmp$fact <- NULL
id <- tmp$id
tmp$id <- NULL
AnimalModelBUGSZStar <- function()
{
  ## Transformed additive genetic values
  for(k in 1:nI) { u[k] ~ dnorm(0, tau2a) }

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + inprod(ZStar[i, ], u[])
  }
}
#-----------------------------------------------------------------------
# Model
#-----------------------------------------------------------------------
writeJags(model = AnimalModelBUGSZStar, filename = "model.txt")
initfunctionZstar <- function(chain) {
  nI <- 6
  nB <- 2
  nc <- 3
  stopifnot(chain %in% 1:nc)
  u <- b <- list()
  for(i in 1:nc){
    u[[i]] <- rnorm(nI, 0, 2)
    b[[i]] <- rnorm(nB, 100, 10)
  }
  .RNG.name <- rep("base::Super-Duper", 3)[chain]
  .RNG.seed <- seq(1, 3, 1)[chain]
  return(list(u = u[[chain]], b = b[[chain]], 
              .RNG.seed = .RNG.seed, .RNG.name = .RNG.name))
}

jags_params <- c("u", "b")
runJagsOut <- run.jags(method = "parallel",
                       model = filename,
                       monitor = jags_params,
                       data = tmp,
                       n.chains = nChains,
                       burnin = nBurninSteps,
                       sample = ceiling(nUseSteps/nChains),
                       thin = nThinSteps,
                       summarise = FALSE,
                       plots = FALSE,
                       inits = initfunctionZstar)
## Calling 3 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.0 on Tue Mar 16 17:01:25 2021
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 6
##    Unobserved stochastic nodes: 8
##    Total graph size: 97
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 30000
## -------------------------------------------------| 30000
## ************************************************** 100%
## . . . Updating 300000
## -------------------------------------------------| 300000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## . 
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete.  Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
# coda samples - MCMC
coda_samples <- as.mcmc.list(runJagsOut)

# parameter estimates
estimatesCholInbreedingZstar <- MCMCsummary(coda_samples, round = 4)
estimatesCholInbreedingZstar
#-----------------------------------------------------------------------
# Obtaining the a estimates
#-----------------------------------------------------------------------
tmp2 <- coda_samples[, 1:6]
tmp3 <- mcpar(tmp2[[1]])
tmp3 <- mcmc(data=as.matrix(as.matrix(tmp2) %*% U),
             start=tmp3[1], end=tmp3[2], thin=tmp3[3])
dimnames(tmp3)[[2]] <- paste("a[", id, "]", sep="")
Z_est <- c(NA, summary(tmp3)[[1]][1:5, 1], rep(NA, 2),
           summary(tmp3)[[1]][6, 1], estimatesCholInbreedingZstar[7:8, 1])
summary(tmp3)
## 
## Iterations = 31001:330986
## Thinning interval = 15 
## Number of chains = 1 
## Sample size per chain = 20000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##         Mean     SD Naive SE Time-series SE
## a[2] -0.4990 0.5579 0.003945       0.003945
## a[3] -1.3017 0.5488 0.003880       0.003880
## a[4]  0.5932 0.5652 0.003997       0.003997
## a[5]  0.6793 0.5963 0.004216       0.004216
## a[6] -1.6511 0.6139 0.004341       0.004341
## a[9]  0.7700 0.5541 0.003918       0.003918
## 
## 2. Quantiles for each variable:
## 
##         2.5%     25%     50%     75%   97.5%
## a[2] -1.5898 -0.8762 -0.5011 -0.1174  0.5911
## a[3] -2.3589 -1.6783 -1.3022 -0.9269 -0.2217
## a[4] -0.5090  0.2093  0.5944  0.9716  1.7202
## a[5] -0.4772  0.2720  0.6784  1.0819  1.8484
## a[6] -2.8363 -2.0782 -1.6483 -1.2278 -0.4441
## a[9] -0.3100  0.3944  0.7695  1.1430  1.8527
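A note on the back-transformation above: the coda samples store one draw of \(\mathbf{u}\) per row, so post-multiplying the sample matrix by the right (upper) Cholesky factor U applies \(\mathbf{a}=\mathbf{L}\mathbf{u}\) (with \(\mathbf{L}=\mathbf{U}^{T}\)) to all draws at once. A small numeric check of this orientation, using a hypothetical \(2\times 2\) NRM:

```r
set.seed(1)
U <- chol(matrix(c(1, 0.5, 0.5, 1), nrow = 2))  # right (upper) factor
u <- matrix(rnorm(10), ncol = 2)                # 5 draws of u, one per row
a_rows <- u %*% U                               # row-wise, as with the samples
a_cols <- t(U) %*% t(u)                         # a = L u, one draw per column
stopifnot(all.equal(a_rows, t(a_cols)))         # same result, either way
```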

6.2 Example 6: Animal model using \(\mathbf{Z}^{*}\) via SVD

#=======================================================================
# Animal model - ZStar via SVD
#=======================================================================
## Prepare data - singular value decomposition (SVD)
(tmp <- AnimalModelBUGSData(ped=example, ZSVD = TRUE, intercept = FALSE))
## $nI
## [1] 6
## 
## $nY
## [1] 6
## 
## $y
## [1] 105  98 101 106  93 109
## 
## $ZStar
##         [,1]        [,2]        [,3]       [,4]        [,5]         [,6]
## 1 -0.7892960  0.01857531 -0.31341827  0.3765138 -0.36968650 -0.002249664
## 2 -0.7778530  0.33628407 -0.07299132 -0.4181467 -0.04076382  0.316261898
## 3 -0.6331143 -0.69262952 -0.09592374  0.1653480  0.21297229  0.193732027
## 4 -0.8448040 -0.35766088  0.12216732 -0.4280523 -0.10464873 -0.272543393
## 5 -0.9357211  0.42566660 -0.19264999  0.1578998  0.31566949 -0.177596377
## 6 -0.5709971  0.11579885  0.77399029  0.2403898 -0.03206039  0.051738041
## 
## $id
## [1] 2 3 4 5 6 9
## 
## $fact
## $fact$u
##            [,1]        [,2]        [,3]       [,4]        [,5]         [,6]
## [1,] -0.4192585  0.01941162 -0.35860591  0.4821170 -0.68028671 -0.004535244
## [2,] -0.4131802  0.35142444 -0.08351497 -0.5354269 -0.07501244  0.637573034
## [3,] -0.3362979 -0.72381348 -0.10975371  0.2117241  0.39190561  0.390557056
## [4,] -0.4487432 -0.37376369  0.13978101 -0.5481108 -0.19257165 -0.549438042
## [5,] -0.4970366  0.44483120 -0.22042565  0.2021869  0.58088611 -0.358028146
## [6,] -0.3033024  0.12101241  0.88558172  0.3078135 -0.05899663  0.104302099
## 
## $fact$d
## [1] 3.5441826 0.9156904 0.7638600 0.6098977 0.2953134 0.2460559
## 
## 
## $x
##   group1 group2
## 1      1      0
## 2      1      0
## 3      0      1
## 4      0      1
## 5      0      1
## 6      1      0
## attr(,"assign")
## [1] 1 1
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
## 
## 
## $nB
## [1] 2
## Set precisions
tmp$tau2a <- 1 / sigma2a
tmp$tau2e <- 1 / sigma2e

## Store the SVD factor and id for later use, then remove them from the
## data list - this is mandatory!
svdUDV <- tmp$fact
tmp$fact <- NULL
id <- tmp$id
tmp$id <- NULL
AnimalModelBUGSZStar <- function()
{
  ## Transformed additive genetic values
  for(k in 1:nI) { u[k] ~ dnorm(0, tau2a) }

  ## Location priors
  for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + inprod(ZStar[i, ], u[])
  }
}
#-----------------------------------------------------------------------
# Model
#-----------------------------------------------------------------------
writeJags(model = AnimalModelBUGSZStar, filename = "model.txt")
jags_params <- c("u", "b")

runJagsOut <- run.jags(method = "parallel",
                       model = filename,
                       monitor = jags_params,
                       data = tmp,
                       n.chains = nChains,
                       burnin = nBurninSteps,
                       sample = ceiling(nUseSteps/nChains),
                       thin = nThinSteps,
                       summarise = FALSE,
                       plots = FALSE,
                       inits = initfunctionZstar)
## Calling 3 simulations using the parallel method...
## Following the progress of chain 1 (the program will wait for all chains
## to finish before continuing):
## Welcome to JAGS 4.3.0 on Tue Mar 16 17:01:27 2021
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 6
##    Unobserved stochastic nodes: 8
##    Total graph size: 97
## . Reading parameter file inits1.txt
## . Initializing model
## . Adaptation skipped: model is not in adaptive mode.
## . Updating 30000
## -------------------------------------------------| 30000
## ************************************************** 100%
## . . . Updating 300000
## -------------------------------------------------| 300000
## ************************************************** 100%
## . . . . Updating 0
## . Deleting model
## . 
## All chains have finished
## Note: the model did not require adaptation
## Simulation complete.  Reading coda files...
## Coda files loaded successfully
## Finished running the simulation
# coda samples - MCMC
coda_samples <- as.mcmc.list(runJagsOut)

# parameter estimates
estimatesZSVDInbreedingZstar <- MCMCsummary(coda_samples, round = 4)
estimatesZSVDInbreedingZstar
#-----------------------------------------------------------------------
# Obtaining the a estimates
#-----------------------------------------------------------------------
tmp2 <- coda_samples[, 1:6]
tmp3 <- mcpar(tmp2[[1]])
tmp3 <- mcmc(data=as.matrix(t(svdUDV$u %*% Diagonal(x=sqrt(svdUDV$d)) %*% t(as.matrix(tmp2[[1]])))),
             start=tmp3[1], end=tmp3[2], thin=tmp3[3])
dimnames(tmp3)[[2]] <- paste("a[", id, "]", sep="")
Z_est_ZSVD <- c(NA, summary(tmp3)[[1]][1:5, 1], rep(NA, 2),
           summary(tmp3)[[1]][6, 1], estimatesZSVDInbreedingZstar[7:8, 1])
summary(tmp3)
## 
## Iterations = 31001:330986
## Thinning interval = 15 
## Number of chains = 1 
## Sample size per chain = 20000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##         Mean     SD Naive SE Time-series SE
## a[2] -0.5075 0.5544 0.003920       0.003920
## a[3] -1.3112 0.5508 0.003895       0.003895
## a[4]  0.5862 0.5609 0.003966       0.003827
## a[5]  0.6630 0.5970 0.004221       0.004221
## a[6] -1.6604 0.6165 0.004359       0.004409
## a[9]  0.7640 0.5531 0.003911       0.003911
## 
## 2. Quantiles for each variable:
## 
##         2.5%     25%     50%     75%   97.5%
## a[2] -1.5802 -0.8860 -0.5063 -0.1337  0.5844
## a[3] -2.3852 -1.6840 -1.3103 -0.9380 -0.2409
## a[4] -0.5113  0.2073  0.5916  0.9660  1.6831
## a[5] -0.4877  0.2557  0.6593  1.0656  1.8408
## a[6] -2.8625 -2.0740 -1.6675 -1.2447 -0.4445
## a[9] -0.3220  0.3924  0.7643  1.1396  1.8430

6.3 Comparing parameter estimates using \(\mathbf{Z}^{*}\) via Cholesky and SVD

data_comp <-
  data.frame(CholInb = Z_est[1:9],
             SVDInb = Z_est_ZSVD[1:9],
             Animal = gl(n = 9,  k = 1,
                         labels = paste0("Animal ", 1:9)))

data_comp %>%
  ggplot(aes(y = SVDInb,  x = CholInb,  label = Animal)) +
  geom_point(shape = 1,  alpha = 0.9) +
  geom_abline(intercept = 0,  slope = 1) +
  geom_text_repel(
    nudge_y      = 0.05,
    direction    = "x",
    vjust        = 0,
    segment.size = 0.2) +
  ylab("Estimates considering the Singular Value Decomposition") +
  xlab("Estimates considering the Cholesky decomposition") +
  theme_bw(base_size = 12)

data_diff <- data.frame(
  Parameter = c(paste0("a[",1:9,"]"),paste0("b[",1:2,"]")),
  Difference = Z_est_ZSVD[1:11] - Z_est[1:11],
  Ratio = Z_est_ZSVD[1:11]/Z_est[1:11])

data_diff %>%
  kbl(
    caption = "Difference and ratio between parameter estimates",
    digits = 4,
  ) %>%
  kable_paper("hover", full_width = F) %>%
  column_spec(2, color = "white",
              background = spec_color(data_diff[,2], end = 0.7),
              popover = paste("am:", data_diff[,2])) %>%
  column_spec(3, color = "white", 
              background = spec_color(data_diff[,2], end = 0.7),
              popover = paste("am:", data_diff[,3]))
Table 6.1: Difference and ratio between parameter estimates
Parameter Difference Ratio
a[1] NA NA
a[2] -0.0085 1.0170
a[3] -0.0096 1.0074
a[4] -0.0070 0.9882
a[5] -0.0164 0.9759
a[6] -0.0093 1.0056
a[7] NA NA
a[8] NA NA
a[9] -0.0060 0.9922
b[1] 0.0008 1.0000
b[2] 0.0044 1.0000

7 Model estimates summary

Consider the following general models:

  • Model 1 (m1): AM without accounting for inbreeding
  • Model 2 (m2): AM accounting for inbreeding
  • Model 3 (m3): AM accounting for inbreeding and Cholesky Decomposition
  • Model 4 (m4): AM accounting for inbreeding, Cholesky Decomposition, and Reordering Pedigree
  • Model 5 (m5): AM accounting for inbreeding, Cholesky Decomposition, and \(\mathbf{Z}^{*}\)
  • Model 6 (m6): AM accounting for inbreeding, SVD Decomposition, and \(\mathbf{Z}^{*}\)

7.1 Variance considered as known

Table 7.1: Parameter estimates from different models considering variance known
Parameters m1 m2 m3 m4 m5 m6
a[1] -0.6997 -0.7023 -0.7034 -0.6992 NA NA
a[2] -0.4985 -0.5076 -0.5069 -0.5061 -0.4990 -0.5075
a[3] -1.2976 -1.3061 -1.3053 -1.3064 -1.3017 -1.3112
a[4] 0.5880 0.5851 0.5860 0.5872 0.5932 0.5862
a[5] 0.6708 0.6677 0.6685 0.6687 0.6793 0.6630
a[6] -1.6537 -1.6595 -1.6568 -1.6567 -1.6511 -1.6604
a[7] -0.1737 -0.2333 -0.2319 -0.2317 NA NA
a[8] 0.6372 0.6478 0.6501 0.6501 NA NA
a[9] 0.8675 0.7627 0.7635 0.7644 0.7700 0.7640
b[1] 104.3131 104.3539 104.3531 104.3529 104.3479 104.3487
b[2] 100.1340 100.1380 100.1365 100.1360 100.1322 100.1366
sigma2a 0.3333 0.3333 0.3333 0.3333 0.3333 0.3333
sigma2e 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

7.2 Variance considered as unknown

Table 7.2: Parameter estimates from different models considering variance unknown
Parameters m1 m2 m3 m4 m5 m6
a[1] -2.0188 -2.0454 -2.0251 -2.0226 NA NA
a[2] 0.1436 0.0861 0.0482 0.1378 -0.0475 0.2375
a[3] -2.9251 -2.9968 -3.0607 -3.0535 -3.0616 -2.8288
a[4] 0.9094 0.8918 0.8190 0.8908 0.6421 0.9579
a[5] 2.9099 2.9109 2.8608 2.9507 2.6269 2.9717
a[6] -3.1429 -3.1715 -3.2691 -3.2944 -3.3911 -3.0758
a[7] 0.5561 0.4647 0.3962 0.4076 NA NA
a[8] 1.3607 1.4729 1.4568 1.5443 NA NA
a[9] 2.2538 2.1805 2.1897 2.2806 2.0146 2.3233
b[1] 104.1735 104.2421 104.2664 104.2121 104.2732 104.2422
b[2] 99.7614 99.7726 99.8482 99.8105 99.8809 99.8688
sigma2a 66.3164 67.5702 67.6482 66.4938 76.5733 73.4374
sigma2e 42.3992 42.2566 42.2971 41.3676 42.4762 42.4870

8 NIMBLE implementation

Here we show the NIMBLE implementation, illustrated with the animal model accounting for inbreeding and using the Cholesky decomposition of the NRM. The code below can easily be extended to the other animal models.

library(parallel)

# NIMBLE function
run_MCMC <- function(seed = sample(1:1e5, 1),  data) {
  library(nimble)
  sigma2a <- 1/3
  sigma2e <- 1
  nimble_params <- c("a", "b")
  nChains <- 3
  nBurninSteps <- 30000
  nThinSteps <- 15
  nUseSteps <- 60000
  #---------------------------------------------------------------------
  # Code
  #---------------------------------------------------------------------
  AnimalModelChol_Nimble <- nimbleCode({
    ## Additive genetic values
    for(k in 1:nI) {
      u[k] ~ dnorm(0, tau2a)
      a[id[k]] <- u[k] * wsqr[id[k]] + 0.5 * (a[fid[k]] + a[mid[k]])
    }
    a[nU] <- 0 # NULL (zero) holder

    ## Fixed Effects
    for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

    ## Phenotypes
    for(i in 1:nY) {
      y[i] ~ dnorm(mu[i], tau2e)
      mu[i] <- inprod(x[i, 1:nB], b[1:nB]) + a[idy[i]]
    }
  })
  #---------------------------------------------------------------------
  # Model
  #---------------------------------------------------------------------
  myModel <- nimbleModel(
    code = AnimalModelChol_Nimble,
    constants = list(nI = data$nI,
                     nY = data$nY,
                     nU = data$nU,
                     nB = data$nB,
                     id = data$id,
                     idy = data$idy,
                     fid = data$fid,
                     mid = data$mid,
                     wsqr = data$wsqr,
                     tau2a = 1/sigma2a,
                     tau2e = 1/sigma2e
                     ),
    data = list(y = data$y,
                x = data$x),
    inits = list(u = rnorm(data$nI,0,2),
                 b = rnorm(data$nB, 0, 1))
  )
  #---------------------------------------------------------------------
  # Configuring model
  #---------------------------------------------------------------------
  compModel <- compileNimble(myModel)
  confModel <- configureMCMC(
    compModel,
    monitors = nimble_params
  )
  #---------------------------------------------------------------------
  # Building model
  #---------------------------------------------------------------------
  myMCMC <- buildMCMC(confModel)
  #---------------------------------------------------------------------
  # Compile MCMC
  #---------------------------------------------------------------------
  compMCMC <- compileNimble(myMCMC)
  #---------------------------------------------------------------------
  # Running MCMC
  #---------------------------------------------------------------------
  start_time <- Sys.time()
  results <- runMCMC(compMCMC,
                     nburnin = nBurninSteps,
                     niter = ceiling((nUseSteps/nChains) * nThinSteps),
                     thin = nThinSteps,
                     setSeed = seed,
                     samplesAsCodaMCMC = TRUE)
  end_time <- Sys.time()
  time <- difftime(end_time, start_time, units='secs')  
  return(list(results = results, time = time))
}

#=======================================================================
# Parallel computing
#=======================================================================
tmp <- AnimalModelBUGSData(ped=example,  Chol = TRUE, intercept = FALSE)
#-------------------------------------------------------------------
# Model
#-------------------------------------------------------------------
this_cluster <- makeCluster(3)
start_time2 <- Sys.time() 
chains_all <- parLapply(cl = this_cluster, X = 1:3, fun = run_MCMC, 
                    data = tmp)
end_time2 <- Sys.time()
timeNimble2 <- difftime(end_time2, start_time2, units='secs')  
stopCluster(this_cluster)
chains <- list(chains_all[[1]]$results,
               chains_all[[2]]$results,
               chains_all[[3]]$results)

timeNimble <- mean(c(chains_all[[1]]$time,
                     chains_all[[2]]$time,
                     chains_all[[3]]$time))
#-------------------------------------------------------------------
# coda samples - MCMC
coda_samples <- as.mcmc.list(chains)

# Total number of samples
(Nsamples <- nrow(coda_samples[[1]])*3)
## [1] 54000
# parameter estimates
(estimatesCholNimble <- MCMCsummary(coda_samples, round = 4))

8.1 Comparing NIMBLE and JAGS outputs

Here we can see that the estimates obtained with the JAGS and NIMBLE implementations are approximately the same.

data_comp <-
  data.frame(CholInbJags = estimatesCholInbreeding[c(1:9), 1],
             CholInbNimble = estimatesCholNimble[c(1:9), 1],
             Animal = gl(n = 9,  k = 1,
                         labels = paste0("Animal ", 1:9)))

data_comp %>%
  ggplot(aes(y = CholInbNimble,  x = CholInbJags,  label = Animal)) +
  geom_point(shape = 1,  alpha = 0.9) +
  geom_abline(intercept = 0,  slope = 1) +
  geom_text_repel(
    nudge_y      = 0.05,
    direction    = "x",
    vjust        = 0,
    segment.size = 0.2) +
  ylab("Estimates considering the Cholesky decomposition using NIMBLE") +
  xlab("Estimates considering the Cholesky decomposition using JAGS") +
  theme_bw(base_size = 12)

data_diff <- data.frame(
  Parameter = rownames(estimatesCholInbreeding)[c(1:9, 11:12)],
  Difference = estimatesCholNimble[c(1:9,  11, 12), 1] -
    estimatesCholInbreeding[c(1:9,  11, 12), 1],
  Ratio = estimatesCholNimble[c(1:9,  11, 12), 1]/
    estimatesCholInbreeding[c(1:9,  11, 12), 1])

data_diff %>%
  kbl(
    caption = "Difference and ratio between parameter estimates",
    digits = 4,
  ) %>%
  kable_paper("hover", full_width = F) %>%
  column_spec(2, color = "white",
              background = spec_color(data_diff[,2], end = 0.7),
              popover = paste("am:", data_diff[,2])) %>%
  column_spec(3, color = "white", 
              background = spec_color(data_diff[,2], end = 0.7),
              popover = paste("am:", data_diff[,3]))
Table 8.1: Difference and ratio between parameter estimates
Parameter Difference Ratio
a[1] 0.0044 0.9937
a[2] 0.0009 0.9982
a[3] -0.0009 1.0007
a[4] -0.0015 0.9974
a[5] -0.0022 0.9967
a[6] -0.0019 1.0011
a[7] -0.0031 1.0134
a[8] -0.0006 0.9991
a[9] -0.0003 0.9996
b[1] -0.0027 1.0000
b[2] 0.0034 1.0000

When running a small example with \(n=9\) individuals, JAGS takes 1.371 seconds to draw 60,000 samples, whereas NIMBLE takes 2.945 seconds to draw 54,000 samples using the same BUGS model. For a small example like this, JAGS should be preferred for analysing the data, as it does not need to spend time compiling the model as NIMBLE does, which takes approximately 22.456 seconds to compile everything. However, this result cannot be extended to more complex data sets, as NIMBLE uses a C++ compiler to speed up the analysis. Furthermore, if we substantially increase the number of samples drawn in the MCMC procedure, the reported result may also change.

9 The effective number of simulation draws based on JAGS program

We can compute an ‘effective number of independent simulation draws’ for any estimator of interest, here \(E\left(\mu|y\right)=\bar{\mu}\) (Gelman et al. 2014). Suppose we have \(n\) simulation draws in each of \(m\) MCMC chains; in the absence of autocorrelation, these would amount to \(mn\) independent samples, and an unbiased estimate of the posterior variance would be \(Var\left(\bar{\mu}\right) = \frac{Var\left(\mu|y\right)}{nm}\). In the presence of autocorrelation, on the other hand, we can define an estimator of the effective sample size (\(n_{eff}\)) as \[\widehat{n}_{eff} = \frac{mn}{1 + 2\sum_{t=1}^{T}\widehat{\rho}_{t}},\] where \(\widehat{\rho}_{t}\) is the estimator of the correlation of the sequence \(\mu\) at lag \(t\), as described by Gelman et al. (2014). Ideally, we would have \(\widehat{n}_{eff}=mn\), or at least a good approximation of it, which happens when \(\sum_{t=1}^{T}\widehat{\rho}_{t} \approx 0\).
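To make the estimator concrete, below is a minimal base-R sketch. Note this is a simplification we introduce here for illustration, not the exact variogram-based estimator of Gelman et al. (2014) nor the one used by coda: autocorrelations are averaged across chains, and the lag sum is truncated at the first negative estimate.

```r
# Crude n_eff estimator for a list of equal-length chains: average the
# autocorrelation estimates across chains and truncate the lag sum at
# the first negative estimate (a rough stand-in for the lag T above).
neffHat <- function(chains) {
  m   <- length(chains)
  n   <- length(chains[[1]])
  rho <- rowMeans(sapply(chains, function(x)
    acf(x, lag.max = n - 1, plot = FALSE)$acf[-1, 1, 1]))
  Tcut <- which(rho < 0)[1]              # crude truncation lag
  if (is.na(Tcut)) Tcut <- length(rho) + 1
  m * n / (1 + 2 * sum(rho[seq_len(Tcut - 1)]))
}

set.seed(100)
iid <- replicate(3, rnorm(1000), simplify = FALSE)
ar1 <- replicate(3, as.numeric(arima.sim(list(ar = 0.9), n = 1000)),
                 simplify = FALSE)
neffHat(iid)  # close to mn = 3000 for (nearly) independent draws
neffHat(ar1)  # far below 3000 for strongly autocorrelated draws
```

For independent draws the lag sum is near zero and \(\widehat{n}_{eff}\approx mn\); for an AR(1) chain with coefficient 0.9 it shrinks by roughly a factor of \((1-0.9)/(1+0.9)\).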

Table 9.1 shows the estimate of the effective sample size ratio \(\widehat{R}_{eff} = \displaystyle\frac{\widehat{n}_{eff}}{mn}\), in which \(R_{eff} = 1\) represents the ideal situation.

Table 9.1: Estimate of the effective sample size ratio (\(\widehat{R}_{eff}\)) for each parameter in the model
\(\widehat{R}_{eff}\)
Model Type a[1] a[2] a[3] a[4] a[5] a[6] a[7] a[8] a[9] b[1] b[2]
m1 No Inbreeding 0.9975 0.8890 0.8723 0.9068 0.8954 0.8645 0.8632 0.9856 0.9002 0.9326 0.9060
m2 Inbreeding 0.9888 0.8774 0.8537 0.8981 0.8784 0.8488 0.8383 0.9856 0.8878 0.9248 0.8969
m3 Inbreeding + Cholesky Decomposition 1.0000 1.0069 1.0078 1.0000 1.0000 1.0000 0.9836 1.0228 1.0000 1.0000 0.9958
m4 Inbreeding + Cholesky Decomposition + Reordering Pedigree 1.0000 0.9920 0.9920 1.0000 0.9883 1.0189 0.9842 1.0227 0.9677 1.0000 1.0000
m5 Inbreeding + Cholesky Decomposition + Zstar NA 1.0000 1.0000 1.0126 1.0000 0.9834 NA NA 1.0000 1.0000 1.0000
m6 Inbreeding + Zstar + SVD NA 1.0000 1.0452 1.0017 1.0000 0.9872 NA NA 1.0000 1.0000 1.0000

10 Computing cost

Waldmann (2009b) reported that his implementation is faster and results in Markov chains with better mixing (smaller autocorrelations) than that of Damgaard (2007b). The code proposed by Waldmann (2009b) (and here, function AnimalModelBUGSVar) first computes the parent average and precision (1 / variance) and then samples from the normal distribution with these two parameters. Damgaard (2007b), on the other hand, first sampled from the standard normal distribution, scaled by the square root of the within-family segregation variance coefficient, added the parent average, and at the end scaled by the additive genetic standard deviation. The approach of Damgaard (2007b) is essentially the transformed (a priori independent) sampler for additive genetic values (e.g. Waagepetersen, N., and Sorensen (2008), and Shariati and Sorensen (2009b)). To highlight the differences between the proposed implementations, the code by Damgaard (2007b) was rewritten in two variants (functions Variant 1 and Variant 2) and compared for speed and mixing with the code presented here (function AnimalModelBUGSVar). The original implementations of Damgaard (2007b) and Waldmann (2009b) were not tested because they are not generic.

See below two variants of the animal model with transformed (a priori independent) additive genetic values:

# Variant 1
AnimalModelBUGSVariant1Var <- function()
{
  ## Priors - precision parameters
  tau2e ~ dgamma(0.001, 0.001)
  tau2a ~ dgamma(0.001, 0.001)
  sigma2e <- pow(tau2e, -1)
  sigma2a <- pow(tau2a, -1)

  ## Transformed additive genetic values - variant 1
  for(k in 1:nI) {
    u[k] ~ dnorm(0, tau2a)
    a[id[k]] <- u[k] * wsqr[id[k]] + 
      0.5 * (a[fid[k]] + a[mid[k]])
    }
  a[nU] <- 0 # NULL (zero) holder

  ## Location priors
  for(j in 1:nB) {
    b[j] ~ dnorm(0, 1.0E-6)
  }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}

# Variant 2
AnimalModelBUGSVariant2Var <- function()
{
  ## Priors - precision parameters
  tau2e ~ dgamma(0.001, 0.001)
  tau2a ~ dgamma(0.001, 0.001)
  sigma2e <- pow(tau2e, -1)
  sigma2a <- pow(tau2a, -1)

  ## Transformed additive genetic values - variant 2
  sigmaa <- pow(tau2a, -0.5) # additive genetic standard deviation
  for(k in 1:nI) { 
    u[k] ~ dnorm(0, 1)
    a[id[k]] <- u[k] * wsqr[id[k]] * sigmaa + 
      0.5 * (a[fid[k]] + a[mid[k]])
    }
  a[nU] <- 0 # NULL (zero) holder

  ## Location priors
  for(j in 1:nB) {
    b[j] ~ dnorm(0, 1.0E-6)
  }

  ## Phenotypes
  for(i in 1:nY) {
    y[i] ~ dnorm(mu[i], tau2e)
    mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
  }
}

Additionally, two other parameterizations of the animal model are interesting, as they incorporate the NRM into the design matrix \(\mathbf{Z}\) to remove a priori dependence between additive genetic values, i.e., \(\mathbf{u}|\sigma_{a}^{2}\sim N\left(\mathbf{0},\mathbf{I}\sigma_{a}^{2}\right)\). One approach is to use the left Cholesky factor \(\mathbf{Z}^{*}=\mathbf{Z}\mathbf{L}=\mathbf{Z}\mathbf{T}\mathbf{W}^{\frac{1}{2}}\) (Quaas, Anderson, and Gilmour 1984b), where the original \(\mathbf{a}\) can be computed using \(\mathbf{a}=\mathbf{T}\mathbf{W}^{\frac{1}{2}}\mathbf{u}\) or, more efficiently, by solving \(\mathbf{T}^{-1}\mathbf{a}=\mathbf{W}^{\frac{1}{2}}\mathbf{u}\), due to the sparseness of \(\mathbf{T}^{-1}\) (e.g. Waagepetersen, N., and Sorensen (2008)). In a similar manner, Waldmann et al. (2008a) used a singular value decomposition of the NRM, i.e., \(\mathbf{A}=\mathbf{U}\mathbf{S}\mathbf{U}^{T}\), where \(\mathbf{U}\) is a dense orthonormal matrix \(\left(\mathbf{U}\mathbf{U}^{T}=\mathbf{I}\right)\) and \(\mathbf{S}\) is a diagonal matrix. The parameterization is then \(\mathbf{Z}^{*}=\mathbf{Z}\mathbf{U}\mathbf{S}^{\frac{1}{2}}\) and \(\mathbf{a}=\mathbf{U}\mathbf{S}^{\frac{1}{2}}\mathbf{u}\). The JAGS implementation of such a parameterization (function AnimalModelBUGSZStarVar) is straightforward, because the additive genetic values become independent and are fitted with a homogeneous variance (precision), while the pedigree information is incorporated into the design matrix \(\mathbf{Z}^{*}\). However, this means that the whole dense matrix \(\mathbf{Z}^{*}\) of order \(nY\times nI\) must be passed to JAGS. Initial tests showed that the CPU time is so long that these two JAGS approaches are applicable only to pedigrees with a handful of members. A similar conclusion also holds for the augmented parameterization of the mixed model by Harville (1986a). Therefore, these approaches were not tested for speed and mixing.
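As a toy illustration of the Cholesky route (using a hypothetical five-animal pedigree and base R only), we can build the NRM \(\mathbf{A}\) with the tabular method, take its lower Cholesky factor \(\mathbf{L}\), and form \(\mathbf{Z}^{*}=\mathbf{Z}\mathbf{L}\):

```r
# Hypothetical toy pedigree: parents precede offspring, founders have NA.
ped <- data.frame(id   = 1:5,
                  sire = c(NA, NA, 1, 1, 3),
                  dam  = c(NA, NA, 2, 2, 4))
n <- nrow(ped)

# Numerator relationship matrix A via the tabular method
A <- diag(n)
for (i in 1:n) {
  s <- ped$sire[i]; d <- ped$dam[i]
  for (j in seq_len(i - 1)) {
    A[i, j] <- A[j, i] <- 0.5 * ((if (is.na(s)) 0 else A[j, s]) +
                                 (if (is.na(d)) 0 else A[j, d]))
  }
  A[i, i] <- 1 + (if (!is.na(s) && !is.na(d)) 0.5 * A[s, d] else 0)
}

L <- t(chol(A))   # lower Cholesky factor, so A = L %*% t(L)
Z <- diag(n)      # one phenotypic record per individual (toy case)
ZStar <- Z %*% L  # then a = L %*% u with u ~ N(0, I * sigma2a)
```

Animal 5 (an offspring of full sibs 3 and 4) is inbred, so its diagonal element of \(\mathbf{A}\) exceeds 1; for real pedigrees one would use a dedicated package (e.g. pedigreemm) rather than this dense loop.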

Consider now the code below, which shows an animal model with a reparameterized design matrix \(\mathbf{Z}\).

## Transformed additive genetic values
for(k in 1:nI) { u[k] ~ dnorm(0, tau2a) }

## Location priors
for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }

## Phenotypes
for(i in 1:nY) {
  y[i] ~ dnorm(mu[i], tau2e)
  mu[i] <- inprod(x[i, ], b[]) + inprod(u[], ZStar[i, ])
}

The implementations (functions AnimalModelBUGSVar, Variant 1, and Variant 2) were tested using the simple example described in section 1. Two approaches were used: one with variances assumed known, and one with variances inferred in addition to the other model parameters (\(\mathbf{b}\) and \(\mathbf{a}\)).

The sampling time reported by JAGS was collected, and for each parameter an effective sample size (\(n_{eff}\)) was computed using the coda package (Plummer et al. 2009). This quantity describes the amount of information contained in the samples collected by MCMC sampling. The effective sample size is almost always smaller than the nominal sample size, due to autocorrelations between subsequent samples of the Markov chain (e.g. Kass et al. (1998)). Here we are interested in the efficiency of MCMC algorithms in generating accurate posterior samples in a Bayesian analysis. To summarise algorithm efficiency, we use the MCMC pace, or computing cost per sample, which measures the average time needed to generate each effectively independent sample. Thus, the computing cost per \(N_{s}\) samples for parameter \(p\) \(\left(c_{p}\right)\) was evaluated as \(c_{p}=N_{s} \times t_{n}/n_{eff}^{(p)}\), where \(t_{n}\) is the time spent drawing \(n\) samples and \(n_{eff}^{(p)}\) is the \(n_{eff}\) for the \(p\)th parameter. Considering a model with \(P\) parameters, we can summarise the average computing cost per sample as \(\bar{c}=\frac{N_{s} \times t_{n}}{P}\sum_{p=1}^{P}1/n_{eff}^{(p)}\). However, one or a few parameters often mix worse than the others and consequently limit which model can be adequately chosen. For this reason, we also report the maximum computing cost per \(N_{s}\) samples (\(c_{max} = \max\lbrace{c_{p}\rbrace}_{p=1}^{P}\)), which represents the longest time taken to generate each effectively independent sample.
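To make these formulas concrete, here is a small worked example. The \(n_{eff}\) values below are hypothetical, chosen only for illustration; in practice they would come from coda (e.g. effectiveSize() on the MCMC output).

```r
# Computing cost per Ns samples: c_p = Ns * t_n / n_eff[p].
# The n_eff values are hypothetical, for illustration only.
Ns    <- 10000                                   # reference number of samples
t_n   <- 1.2168                                  # seconds spent drawing n samples
n_eff <- c("a[1]" = 3900, "a[2]" = 3500, "b[1]" = 3700)

c_p   <- Ns * t_n / n_eff                        # cost per parameter
c_bar <- Ns * t_n / length(n_eff) * sum(1 / n_eff)  # mean cost, as in the text
c_max <- max(c_p)                                # worst-mixing parameter's cost

round(c(c_p, cost_mean = c_bar, cost_max = c_max), 4)
```

Note that \(\bar{c}\) is simply the mean of the \(c_{p}\), and \(c_{max}\) is driven by the parameter with the smallest \(n_{eff}\).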

All computations were performed on a Dell Inspiron 17 7000 with a 10\(^{\tiny \mbox{th}}\) Generation Intel\(\circledR\) Core\(^{\tiny\mbox{TM}}\) i7 processor (four cores at 1.80 GHz) and 16 GB of random access memory (RAM) plus 20 GB of swap space, running the 64-bit Linux Mint 19.2 Cinnamon system with kernel 5.2.2-050202-generic.

10.1 Example 1

We run three chains; for each chain, the length, burn-in, and thinning rate were arbitrarily set to 20,000, 30,000, and 15, respectively. The data example can be seen below:

example

Tables 10.1 and 10.2 show the results for known and unknown variances, respectively, including the implementations using transformed additive genetic values (Damgaard (2007b); functions Variant 1 and Variant 2). In general, models m1, m2, m7, and m8 show the lowest computing time, while m3, m5, and m7 show the lowest computing cost per sample, regardless of whether the variance components are known or unknown.

Table 10.1: Effect of animal model (known variances) implementation in JAGS using the mean and standard error of sampling time (t) in seconds, computing cost per 10,000 samples (\(c_{p}=10,000 \times t_{n}/n_{eff}\)), mean computing cost per 10,000 samples (\(\bar{c}\)), and maximum computing cost per 10,000 samples (\(c_{max}\)) based on 50 models fitted using three chains each considering a length of 20,000, burn-in set to 30,000, and thinning to 15
Time
Computing cost per sample (c)
Model Type Mean SE a[1] a[2] a[3] a[4] a[5] a[6] a[7] a[8] a[9] b[1] b[2] cost_Mean cost_max
m1 No Inbreeding 1.2168 0.0078 0.2033 0.2281 0.2325 0.2237 0.2265 0.2346 0.2349 0.2058 0.2253 0.2175 0.2239 0.2233 0.2349
m2 Inbreeding 1.1781 0.0285 0.1986 0.2238 0.2300 0.2186 0.2235 0.2313 0.2342 0.1992 0.2212 0.2123 0.2189 0.2192 0.2342
m3 Inbreeding + Cholesky Decomposition 1.5529 0.0349 0.2588 0.2570 0.2568 0.2588 0.2588 0.2588 0.2631 0.2531 0.2588 0.2588 0.2599 0.2584 0.2631
m4 Inbreeding + Cholesky Decomposition (Variant 1) 2.7878 0.0725 0.4646 0.4614 0.4610 0.4646 0.4646 0.4646 0.4724 0.4543 0.4646 0.4646 0.4666 0.4640 0.4724
m5 Inbreeding + Cholesky Decomposition (Variant 2) 2.7865 0.0915 0.4644 0.4644 0.4644 0.4644 0.4644 0.4644 0.4594 0.4505 0.4703 0.4655 0.4644 0.4633 0.4703
m6 Inbreeding + Cholesky Decomposition + Reordering Pedigree 1.4767 0.0267 0.2461 0.2481 0.2481 0.2461 0.2490 0.2416 0.2501 0.2407 0.2543 0.2461 0.2461 0.2469 0.2543
m7 Inbreeding + Cholesky Decomposition + Zstar 1.2577 0.0254 NA 0.2096 0.2096 0.2070 0.2096 0.2132 NA NA 0.2096 0.2096 0.2096 0.2097 0.2132
m8 Inbreeding + Zstar + SVD 1.2598 0.0318 NA 0.2100 0.2009 0.2096 0.2100 0.2127 NA NA 0.2100 0.2100 0.2100 0.2091 0.2127
Table 10.2: Effect of animal model (unknown variances) implementation in JAGS using the mean and standard error of sampling time (t) in seconds, computing cost per 10,000 samples (\(c_{p}=10,000 \times t_{n}/n_{eff}\)), mean computing cost per 10,000 samples (\(\bar{c}\)), and maximum computing cost per 10,000 samples (\(c_{max}\)) based on 50 models fitted using three chains each considering a length of 20,000, burn-in set to 30,000, and thinning to 15
Time
Computing cost per sample (c)
Model Type Mean SE a[1] a[2] a[3] a[4] a[5] a[6] a[7] a[8] a[9] b[1] b[2] cost_Mean cost_max
m1 No Inbreeding 2.2973 0.0847 1.0568 4.9257 5.0535 5.6073 5.8800 5.5331 3.7477 0.7853 5.0837 3.7550 4.3859 4.1649 5.8800
m2 Inbreeding 2.5375 0.0819 1.1750 5.5892 5.7894 6.3820 6.6601 6.1996 4.5385 0.9712 5.8833 4.2892 5.0148 4.7720 6.6601
m3 Inbreeding + Cholesky Decomposition 2.6787 0.0947 3.2560 4.6798 4.9114 4.7885 5.0266 4.8739 4.0257 2.3665 4.8109 3.6795 3.9214 4.2127 5.0266
m4 Inbreeding + Cholesky Decomposition (Variant 1) 2.8721 0.0823 3.4911 5.0177 5.2661 5.1343 5.3896 5.2259 4.3164 2.5374 5.1583 3.9452 4.2046 4.5170 5.3896
m5 Inbreeding + Cholesky Decomposition (Variant 2) 8.0390 0.2557 10.5776 12.9997 16.1426 13.6625 15.3945 16.1816 11.7857 9.1990 14.3810 7.2712 7.9202 12.3196 16.1816
m6 Inbreeding + Cholesky Decomposition + Reordering Pedigree 2.3657 0.1066 3.0431 4.6697 4.6734 4.6468 4.7125 4.6799 4.0322 2.5079 4.6762 3.5489 3.6362 4.0752 4.7125
m7 Inbreeding + Cholesky Decomposition + Zstar 2.6427 0.0696 NA 5.4918 4.5485 4.3096 4.1305 3.6857 NA NA 4.3061 4.2541 4.1676 4.3617 5.4918
m8 Inbreeding + Zstar + SVD 2.2873 0.0841 NA 4.8852 3.3503 3.1976 1.0630 2.9674 NA NA 2.7419 3.3785 3.5778 3.1452 4.8852

The original implementation of Damgaard (2007b) (variant 2; function Variant 2) is the most time-consuming approach and becomes even more time-consuming when variances are inferred.

Although slower, the implementation with transformed additive genetic values consistently produced chains with a considerably larger effective sample size, due to the a priori independence among additive genetic values. The different implementations were compared by the computing cost per effective sample (Tables 10.1 and 10.2), as suggested by Waagepetersen, N., and Sorensen (2008). When variances were assumed known, the cost did not differ much between implementations, though, on average, the standard implementation was the best. This implementation was also the best when variances were unknown.

Shariati and Sorensen (2009b) performed extensive tests of various MCMC samplers for the animal model and concluded that none of the samplers was best in all scenarios, but that transformed and blocked samplers can be beneficial when there are large posterior correlations between the parameters.

10.2 Example 2

Let us now test all model implementations. We used a more complex example with 5,000 to 10,000 animals rather than just 9 of them; nevertheless, the idea is still to show a simplistic breeding scheme. Here, we again considered one approach with variances assumed known and another with variances inferred in addition to the other model parameters, such as year and sex effects. We used three chains: the chain length, burn-in, and thinning rate per chain were arbitrarily set to 5,000, 3,000, and 10, respectively.

10.2.1 Simulation procedure

We simulate a simple beef cattle scheme using the AlphaSimR (Gaynor, Gorjanc, and Hickey 2020) package and the work of Lourenco et al. (2015) to set up initial parameter values. The simulated population has 1,000 animals (500 females and 500 males) per generation, where generation \(F_0\) (founders) was generated using the function runMacs() to produce founder haplotypes. We considered ten chromosomes, 1,000 QTL per chromosome, 5,000 SNPs per chromosome, and nine generations. The mean genetic value for birth weight (BiW) was set to 40 kg, the genetic variance to 20.25 kg\(^2\), and the heritability to \(0.4\). Furthermore, we also established that males are, on average, 1 kilogram heavier than females.

#-----------------------------------------------------------------------
# Definitions
#-----------------------------------------------------------------------
nInd = 1000 # number of individuals
nQtl = 10^3 # number of QTL's / Chromosome
nSnp = 5*10^3 # number of SNPs / Chromosome
nChr = 10 # number of Chromosomes

# Number of Generations
ngen <- 9

# Founders parameters
meanG <- 40 # mean genetic value for the trait 
meanSexDiff <- 1 # mean difference between sexes

sigma2g <- 4.5^2 # genetic variance

# Heritability
h2 <- 0.4

# Error
sigma2e <- sigma2g*(1/h2 - 1)

#-----------------------------------------------------------------------
# Haplotypes
#-----------------------------------------------------------------------
founderPop <- runMacs(nInd = nInd,  nChr = nChr, segSites = nQtl + nSnp,
                     species = "CATTLE", ploidy = 2L)
#save(founderPop, file = "haplotypes.RData")
#load(file = "haplotypes.RData")
#-----------------------------------------------------------------------
# Parameters
#-----------------------------------------------------------------------
SP <- SimParam$new(founderPop)

#-----------------------------------------------------------------------
# Quantitative trait (A = only have additive effects)
#-----------------------------------------------------------------------
SP$addTraitA(nQtlPerChr = nQtl,  mean = meanG,  var = sigma2g)

#-----------------------------------------------------------------------
# How sexes are determined in the simulation
## Systematic assignment to ensure the same number of males and females
#-----------------------------------------------------------------------
SP$setSexes("yes_sys")
#-----------------------------------------------------------------------
# Add SNPs Chip
#-----------------------------------------------------------------------
SP$addSnpChip(nSnp)

Each year, we generate phenotype data, estimate breeding values, select the best \(10\%\) of males, mate them with all females within a generation, and generate progeny, recording their pedigree.

founders <- newPop(founderPop)

# Males
SelMales <- founders[founders@sex == "M"]
  
# Females
SelFemales = founders[founders@sex == "F"]

nDams <- nInd/2 # no selection
nSires <- 0.1 * nInd/2  # Top 10%

# Phenotypes
SelMales = setPheno(SelMales, varE = sigma2e)
SelMales@pheno = SelMales@pheno + meanSexDiff
SelFemales = setPheno(SelFemales, varE = sigma2e)
  
RecSys <- RecVar <- NULL
generation <- 0
year <- 0

RecSys <- RecSysMaleFemale(RecSys, SelMales)
RecSys <- RecSysMaleFemale(RecSys, SelFemales)
RecVar <- PullSumm(RecVar, SelMales)
RecVar <- PullSumm(RecVar, SelFemales)

for(generation in 1:ngen){
  Program ="Standard"
  cat("Evaluating", Program,": ", generation,"\n")
  year <- year + 1
  
  # Mate initial Sires and Dams
  pop_start <- randCross(pop=c(SelMales, SelFemales),
                         nCrosses = nDams, nProgeny = 2)
  
  # Select next generation based on genetic value
  # Males
  pop_males <- pop_start[pop_start@sex == "M"]
  
  # Females
  #SelFemales <- setPheno(SelFemales, h2 = h2)
  pop_females = pop_start[pop_start@sex == "F"]
  
  # Phenotype
  pop_males = setPheno(pop_males, varE = sigma2e)
  pop_males@pheno = pop_males@pheno + meanSexDiff
  
  pop_females = setPheno(pop_females, varE = sigma2e)
  
  # Dataset - tibble
  RecSys <- RecSysMaleFemale(RecSys, pop_males)
  RecVar <- PullSumm(RecVar, pop_males)
  
  RecSys = RecSysMaleFemale(RecSys, pop_females)
  RecVar <- PullSumm(RecVar, pop_females)
  
  
  # Selection for the next mate
  SelMales <- selectInd(pop_males, nSires, use = "pheno", 
                          sex = "M")
  SelFemales <- pop_females

}

example <- RecSys
example$phenotype <- as.numeric(example$pheno)

example$sex <- as.factor(example$sex)
example <- example %>%
  mutate_if(is.character, as.numeric)

example <- example[, c(2:4,6,5,9)]

# Father and Mother - changing 0 to NA
example <- example %>%
  mutate(father = na_if(father, 0)) %>%
  mutate(mother = na_if(mother, 0))

# ordering pedigree from 1:n
nI <- nrow(example)
nI==length(unique(example$ind))

example$index <- seq(1:nI)
seq <- paste0(seq(1:nI),"A")
for(i in 1:nI){
  t_n <- example$ind[i]
  example$father[example$father == t_n] <- seq[i]
  example$mother[example$mother == t_n] <- seq[i]
  example$ind[example$ind == t_n] <- seq[i]
}

# removing the letter A
example$ind <- as.numeric(gsub("([0-9]+).*$", "\\1", example$ind))
example$father <- as.numeric(gsub("([0-9]+).*$", "\\1", example$father))
example$mother <- as.numeric(gsub("([0-9]+).*$", "\\1", example$mother))

example <- example[,c(1:6)]
example$year <- as.factor(example$year)

example %>%
  saveRDS(file = paste0("animal_sim",ngen+1,".RDS"))

save(RecSys, RecVar, file = paste0("animal_simRecs",ngen+1,".RData"))

The dataset and its structure can be seen below:

example2 <- readRDS(file = "./datasets/animal_sim10.RDS")
example2
str(example2)
## tibble [10,000 × 6] (S3: tbl_df/tbl/data.frame)
##  $ ind      : num [1:10000] 1 2 3 4 5 6 7 8 9 10 ...
##  $ father   : num [1:10000] NA NA NA NA NA NA NA NA NA NA ...
##  $ mother   : num [1:10000] NA NA NA NA NA NA NA NA NA NA ...
##  $ year     : Factor w/ 10 levels "0","1","2","3",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ sex      : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
##  $ phenotype: num [1:10000] 37.7 35.8 28.4 33.6 32.9 ...

Figure 10.1: Directed acyclic graph of the animal model with \(K\) individuals and \(I\) phenotypic records, considering fixed effects of Sex (\(\mbox{Sex}_{j}\)) and Year (\(\mbox{Year}_{l}\)), and a representation of Mendelian sampling terms (\(w_k\)) and the error term (\(e_i\)), where \(\sigma_{a}^{2}\) is the genetic variance, \(a_{f(k)}\) and \(a_{m(k)}\) are the parents’ breeding values, \(\mu_{i}\) is the linear predictor, and \(\sigma_{e}^{2}\) the variance of the error term. We omit the intercept term in the linear predictor.

10.3 Computing Cost

We again run three chains, with the chain length, burn-in, and thinning rate per chain arbitrarily set to 5,000, 1,000, and 10, respectively, which gives us 3,000 simulation draws in total. The results for some models are presented below, and further discussion is given at the end of the section.

  1. Basic Animal model
# Variance unknown
AnimalModelBUGSVar
## function()
## {
##   ## Priors - precision parameters
##   tau2e ~ dgamma(0.001, 0.001)
##   tau2a ~ dgamma(0.001, 0.001)
##   sigma2e <- pow(tau2e, -1)
##   sigma2a <- pow(tau2a, -1)
## 
##   ## Additive genetic values
##   for(k in 1:nI) {
##     a[id[k]] ~ dnorm(pa[id[k]], Xtau2a[id[k]])
##     pa[id[k]] <- 0.5 * (a[fid[k]] + a[mid[k]])
##     Xtau2a[id[k]] <- winv[id[k]] * tau2a
##   }
##   a[nU] <- 0 # NULL (zero) holder
## 
##   ## Fixed Effects
##   for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }
## 
##   ## Phenotypes
##   for(i in 1:nY) {
##     y[i] ~ dnorm(mu[i], tau2e)
##     mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
##   }
## }
  • Without accounting for inbreeding:
dataNoInb <- do.call('rbind', 
                     lapply(c("./Results/Large_Example/ESSNoInb.RDS",
                              "./Results/Large_Example/ESSNoInbVar.RDS"), readRDS))
dataNoInb$model <- as.factor(dataNoInb$model)

g1 <- dataNoInb %>%
  ggplot(aes(y= cost_mean, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +
  ylab("Mean computing cost per sample (seconds)")+
  ylim(0, max(dataNoInb$cost_max)) +
  facet_wrap(~model)+
  theme_bw() +
  theme(legend.position = "none") 

g2 <- dataNoInb %>%
  ggplot(aes(y= cost_max, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +    
  ylim(0, max(dataNoInb$cost_max)) +
  ylab("Maximum computing cost per sample (seconds)")+
  facet_wrap(~model)+
  theme_bw()
g1+g2

Figure 10.2: Mean and Maximum computing cost per sample (in seconds) for additive breeding values (\(a\)), fixed effect terms (\(b\)), additive variance (\(\sigma^2_a\)), and error term variance (\(\sigma_{e}^2\)) without accounting for inbreeding

  • Accounting for inbreeding:
dataInb <- do.call('rbind', 
                     lapply(c("./Results/Large_Example/ESSInb.RDS",
                              "./Results/Large_Example/ESSInbVar.RDS"), readRDS))
dataInb$model <- as.factor(dataInb$model)

g1 <- dataInb %>%
  ggplot(aes(y= cost_mean, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +
  ylab("Mean computing cost per sample (seconds)")+
  ylim(0, max(dataInb$cost_max)) +
  facet_wrap(~model)+
  theme_bw() +
  theme(legend.position = "none") 

g2 <- dataInb %>%
  ggplot(aes(y= cost_max, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +    
  ylim(0, max(dataInb$cost_max)) +
  ylab("Maximum computing cost per sample (seconds)")+
  facet_wrap(~model)+
  theme_bw()
g1+g2

Figure 10.3: Mean and Maximum computing cost per sample (in seconds) for additive breeding values (\(a\)), fixed effect terms (\(b\)), additive variance (\(\sigma^2_a\)), and error term variance (\(\sigma_{e}^2\)) accounting for inbreeding

  2. Animal model using Cholesky decomposition of the NRM
# Variance unknown
AnimalModelBUGSCholVar
## function()
## {
##   ## Variance priors
##   tau2e ~ dgamma(0.001, 0.001)
##   tau2a ~ dgamma(0.001, 0.001)
##   sigma2e <- 1 / tau2e
##   sigma2a <- 1 / tau2a
## 
##   ## Additive genetic values
##   for(k in 1:nI) {
##     u[k] ~ dnorm(0, tau2a)
##     a[id[k]] <- u[k] * wsqr[id[k]] + 0.5 * (a[fid[k]] + a[mid[k]])
##   }
##   a[nU] <- 0 # NULL (zero) holder
## 
##   ## Location priors
##   for(j in 1:nB) { b[j] ~ dnorm(0, 1.0E-6) }
## 
##   ## Phenotypes
##   for(i in 1:nY) {
##     y[i] ~ dnorm(mu[i], tau2e)
##     mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
##   }
## }
dataChol <- do.call('rbind', 
                     lapply(c("./Results/Large_Example/ESSChol.RDS",
                              "./Results/Large_Example/ESSCholVar.RDS"), readRDS))
dataChol$model <- as.factor(dataChol$model)

g1c <- dataChol %>%
  ggplot(aes(y= cost_mean, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +
  ylab("Mean computing cost per sample (seconds)")+
  ylim(0, max(dataChol$cost_max)) +
  facet_wrap(~model)+
  theme_bw() +
  theme(legend.position = "none") 

g2c <- dataChol %>%
  ggplot(aes(y= cost_max, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +    
  ylim(0, max(dataChol$cost_max)) +
  ylab("Maximum computing cost per sample (seconds)")+
  facet_wrap(~model)+
  theme_bw()
g1c+g2c

Figure 10.4: Mean and Maximum computing cost per sample (in seconds) for additive breeding values (\(a\)), fixed effect terms (\(b\)), additive variance (\(\sigma^2_a\)), and error term variance (\(\sigma_{e}^2\)) accounting for inbreeding and using Cholesky decomposition of the NRM

  1. Animal model variant 1 using Cholesky decomposition of the NRM
# Variance unknown
AnimalModelBUGSVariant1Var
## function()
## {
##   ## Priors - precision parameters
##   tau2e ~ dgamma(0.001, 0.001)
##   tau2a ~ dgamma(0.001, 0.001)
##   sigma2e <- pow(tau2e, -1)
##   sigma2a <- pow(tau2a, -1)
## 
##   ## Transformed additive genetic values - variant 1
##   for(k in 1:nI) {
##     u[k] ~ dnorm(0, tau2a)
##     a[id[k]] <- u[k] * wsqr[id[k]] + 
##       0.5 * (a[fid[k]] + a[mid[k]])
##     }
##   a[nU] <- 0 # NULL (zero) holder
## 
##   ## Location priors
##   for(j in 1:nB) {
##     b[j] ~ dnorm(0, 1.0E-6)
##   }
## 
##   ## Phenotypes
##   for(i in 1:nY) {
##     y[i] ~ dnorm(mu[i], tau2e)
##     mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
##   }
## }
dataCholVar1 <- do.call('rbind', 
                     lapply(c("./Results/Large_Example/ESSCholVar1.RDS",
                              "./Results/Large_Example/ESSCholVar1Var.RDS"), readRDS))
dataCholVar1$model <- as.factor(dataCholVar1$model)

g1 <- dataCholVar1 %>%
  ggplot(aes(y= cost_mean, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +
  ylab("Mean computing cost per sample (seconds)")+
  ylim(0, max(dataCholVar1$cost_max)) +
  facet_wrap(~model)+
  theme_bw() +
  theme(legend.position = "none") 

g2 <- dataCholVar1 %>%
  ggplot(aes(y= cost_max, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +    
  ylim(0, max(dataCholVar1$cost_max)) +
  ylab("Maximum computing cost per sample (seconds)")+
  facet_wrap(~model)+
  theme_bw()
g1+g2

Figure 10.5: Mean and Maximum computing cost per sample (in seconds) for additive breeding values (\(a\)), fixed effect terms (\(b\)), additive variance (\(\sigma^2_a\)), and error term variance (\(\sigma_{e}^2\)) accounting for inbreeding and using Cholesky decomposition of the NRM (variant 1)

  1. Animal model variant 2 using Cholesky decomposition of the NRM
# Variance unknown
AnimalModelBUGSVariant2Var
## function()
## {
##   ## Priors - precision parameters
##   tau2e ~ dgamma(0.001, 0.001)
##   tau2a ~ dgamma(0.001, 0.001)
##   sigma2e <- pow(tau2e, -1)
##   sigma2a <- pow(tau2a, -1)
## 
##   ## Transformed additive genetic values - variant 2
##   sigmaa <- pow(tau2a, -0.5) # additive genetic standard deviation
##   for(k in 1:nI) { 
##     u[k] ~ dnorm(0, 1)
##     a[id[k]] <- u[k] * wsqr[id[k]] * sigmaa + 
##       0.5 * (a[fid[k]] + a[mid[k]])
##     }
##   a[nU] <- 0 # NULL (zero) holder
## 
##   ## Location priors
##   for(j in 1:nB) {
##     b[j] ~ dnorm(0, 1.0E-6)
##   }
## 
##   ## Phenotypes
##   for(i in 1:nY) {
##     y[i] ~ dnorm(mu[i], tau2e)
##     mu[i] <- inprod(x[i, ], b[]) + a[idy[i]]
##   }
## }
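Variant 2 is a non-centered parameterization: the deviates are sampled on a standard normal scale and then multiplied by the additive genetic standard deviation, rather than being sampled directly with precision `tau2a`. A quick sketch (Python, hypothetical helper names) checking that the two parameterizations imply the same marginal variance:

```python
import random

def draw_centered(rng, sigma_a):
    # Centered: a ~ N(0, sigma_a^2), sampled with its own scale
    return rng.gauss(0.0, sigma_a)

def draw_non_centered(rng, sigma_a):
    # Non-centered (variant 2): u ~ N(0, 1), then a = sigma_a * u
    return sigma_a * rng.gauss(0.0, 1.0)

rng = random.Random(42)
sigma_a, n = 2.0, 100_000
var_c = sum(draw_centered(rng, sigma_a) ** 2 for _ in range(n)) / n
var_nc = sum(draw_non_centered(rng, sigma_a) ** 2 for _ in range(n)) / n
# Both empirical variances approach sigma_a^2 = 4; the parameterizations
# define the same model and differ only in how the sampler explores it.
```

This is why the comparison in the text is purely about sampler efficiency: the posterior is unchanged, but the Gibbs sampler's mixing can differ sharply between the two forms.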
dataCholVar2 <- do.call('rbind', 
                     lapply(c("./Results/Large_Example/ESSCholVar2.RDS",
                              "./Results/Large_Example/ESSCholVar2Var.RDS"), readRDS))
dataCholVar2$model <- as.factor(dataCholVar2$model)

g1 <- dataCholVar2 %>%
  ggplot(aes(y= cost_mean, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +
  ylab("Mean computing cost per sample (seconds)") +
#  ylim(0, max(dataCholVar2$cost_max)) +
  facet_wrap(~model, scales = "free") +
  theme_bw() +
  theme(legend.position = "none") 

g2 <- dataCholVar2 %>%
  ggplot(aes(y= cost_max, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +    
#  ylim(0, max(dataCholVar2$cost_max)) +
  ylab("Maximum computing cost per sample (seconds)") +
  facet_wrap(~model, scales = "free") +
  theme_bw()
g1+g2

Figure 10.6: Mean and Maximum computing cost per sample (in seconds) for additive breeding values (\(a\)), fixed effect terms (\(b\)), additive variance (\(\sigma^2_a\)), and error term variance (\(\sigma_{e}^2\)) accounting for inbreeding and using Cholesky decomposition of the NRM (variant 2)

Here we show that performing the Cholesky decomposition beforehand can be used to solve the system of equations as an alternative to direct matrix inversion. However, in our runs it carried an additional cost per effective sample, with a maximum cost per sample of approximately \(82\) seconds against the \(12.5\) seconds observed for direct matrix inversion. Cholesky decomposition becomes computationally costlier as the matrix grows because of the number of floating-point operations required for an \([n \times n]\) matrix, generally estimated as \(\mbox{FLOPS}=\frac{4n^3}{3} = O(n^3)\) (Chari and Salon 2000). On the other hand, this approach yields the inverse more efficiently and with better numerical behaviour, since we only need to invert the triangular factor \(\boldsymbol{L}\) rather than the full matrix \(\boldsymbol{A}=\boldsymbol{L}\boldsymbol{L}^{T}\) (Alfriend et al. 2010). Its computational rate and efficiency depend strongly on implementation details and on the architecture of the computing device used to run the analysis (Chari and Salon 2000).
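A sketch of this idea in Python (a toy \(2 \times 2\) example, assuming a symmetric positive-definite matrix): once \(\boldsymbol{A}=\boldsymbol{L}\boldsymbol{L}^{T}\) is factorized, a system \(\boldsymbol{A}x=b\) is solved with one forward and one back substitution on the triangular factor, with no explicit inverse of \(\boldsymbol{A}\):

```python
def cholesky(A):
    """Lower-triangular factor L with A = L * L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = (A[i][i] - s) ** 0.5
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_chol(L, b):
    """Solve A x = b: forward substitution for L y = b, then back
    substitution for L^T x = y -- no explicit inverse of A is formed."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

A = [[4.0, 2.0], [2.0, 3.0]]
L = cholesky(A)
x = solve_chol(L, [2.0, 3.0])  # x solves A x = [2, 3]
```

The factorization is the \(O(n^3)\) step; each subsequent triangular solve is only \(O(n^2)\), which is why factorizing once up front can still pay off when the same matrix is reused many times.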

We also observed that animal model variant 2 using the Cholesky decomposition of the NRM with the Gibbs sampling algorithm has the highest mean and maximum cost per sample, so we recommend avoiding this parameterization when running MCMC with a Gibbs sampler. Our results showed a maximum cost per effective sample of 4400 seconds and 249 seconds for unknown and known variance components, respectively. Therefore, the original implementation of Damgaard ((Damgaard 2007b); variant 2 in function Variantes 2) is the most time-consuming approach, and the implementation with the transformed additive genetic values consistently produced chains with a considerably lower effective sample size under Gibbs sampling. Furthermore, Shariati and Sorensen (2009b) performed extensive tests of various MCMC samplers for the animal model and concluded that no sampler was best in all scenarios, but that transformed and blocked samplers can be beneficial when there are large posterior correlations between the parameters.
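All comparisons above are in seconds per effective sample, i.e. total sampler runtime divided by the effective sample size (ESS) of the chain. A simplified Python sketch of that metric (the truncation rule below, summing lag autocorrelations until the first nonpositive one, is an illustrative assumption; coda's `effectiveSize` uses a more refined spectral estimator):

```python
import random

def autocorr(chain, lag):
    """Sample autocorrelation of the chain at a given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var

def ess(chain):
    """Effective sample size N / (1 + 2 * sum of rho_k), summing lag
    autocorrelations until the first nonpositive one."""
    n = len(chain)
    total, lag = 0.0, 1
    while lag < n:
        rho = autocorr(chain, lag)
        if rho <= 0:
            break
        total += rho
        lag += 1
    return n / (1 + 2 * total)

def cost_per_effective_sample(runtime_seconds, chain):
    """Seconds of sampler runtime per effectively independent draw."""
    return runtime_seconds / ess(chain)

rng = random.Random(7)
# Sticky toy chain: each draw repeated 10 times, so ESS is roughly N / 10
chain = [v for v in (rng.gauss(0.0, 1.0) for _ in range(500)) for _ in range(10)]
```

A slowly mixing chain can therefore be far more expensive per unit of information than a fast one even when its raw seconds-per-iteration cost is lower, which is exactly the effect driving the variant 2 results.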

10.4 NIMBLE

To exemplify the cost per effective sample based on the NIMBLE package, we chose the animal model using the Cholesky decomposition of the NRM and the Gibbs sampling algorithm. Figure 10.7 shows that NIMBLE has a lower mean and maximum cost per sample than JAGS. We therefore recommend NIMBLE over JAGS for running more complex DAG models such as the one shown here; NIMBLE's compilation of the model to C++ does an excellent job of speeding up the analysis.

# Relabel the known-variance results and write them back to the same file
data2 <- readRDS("./Results/Large_Example/ESSCholNimble.RDS")
data2$model <- "Known Variance"
saveRDS(data2, file = "./Results/Large_Example/ESSCholNimble.RDS")

dataCholNimble <- do.call('rbind', 
                     lapply(c("./Results/Large_Example/ESSCholNimble.RDS",
                              "./Results/Large_Example/ESSCholVarNimble.RDS"), readRDS))
dataCholNimble$model <- as.factor(dataCholNimble$model)
dataCholNimble$par <- as.factor(dataCholNimble$par)

g1 <- dataCholNimble %>%
  ggplot(aes(y= cost_mean, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +
  ylab("Mean computing cost per sample (seconds)")+
  ylim(0, max(dataChol$cost_max)) +
  facet_wrap(~model)+
  theme_bw() +
  ggtitle("NIMBLE")+
  theme(legend.position = "none") 

g2 <- dataCholNimble %>%
  ggplot(aes(y= cost_max, x= Generation)) +
  geom_point(aes(colour = par)) +
  geom_line(aes(linetype = par, colour = par)) +
  xlab("Generation") +    
  ylim(0, max(dataChol$cost_max)) +
  ylab("Maximum computing cost per sample (seconds)")+
  facet_wrap(~model)+
  ggtitle("NIMBLE")+
  theme_bw()

p1 <- g1c + ggtitle("JAGS")
p2 <- g2c + ggtitle("JAGS")
(p1+p2)/(g1+g2)

Figure 10.7: Mean and Maximum computing cost per sample (in seconds) for JAGS and NIMBLE considering the additive breeding values (\(a\)), fixed effect terms (\(b\)), additive variance (\(\sigma^2_a\)), and error term variance (\(\sigma_{e}^2\)) accounting for inbreeding and using Cholesky decomposition of the NRM

References

Alfriend, Kyle T., Srinivas R. Vadali, Pini Gurfil, Jonathan P. How, and Louis S. Breger. 2010. “Chapter 3 - The Basics of Analytical Mechanics, Optimization, Control and Estimation.” In Spacecraft Formation Flying, edited by Kyle T. Alfriend, Srinivas R. Vadali, Pini Gurfil, Jonathan P. How, and Louis S. Breger, 39–57. Oxford: Butterworth-Heinemann. https://doi.org/10.1016/B978-0-7506-8533-7.00208-6.
Bates, Douglas, and Martin Maechler. 2015. “MatrixModels: Modelling with Sparse and Dense Matrices.” https://CRAN.R-project.org/package=MatrixModels.
Bates, Douglas, and Ana Ines Vazquez. 2014. “Pedigreemm: Pedigree-Based Mixed-Effects Models.” https://CRAN.R-project.org/package=pedigreemm.
Chari, M. V. K., and S. J. Salon. 2000. “11 - SOLUTION OF EQUATIONS.” In Numerical Methods in Electromagnetism, edited by M. V. K. Chari and S. J. Salon, 591–706. San Diego: Academic Press. https://doi.org/10.1016/B978-012615760-4/50012-2.
Damgaard, L. H. 2007b. “Technical Note: How to Use WinBUGS to Draw Inferences in Animal Models.” Journal of Animal Science 85 (6): 1363–68. https://doi.org/10.2527/jas.2006-543.
Denwood, Matthew J. 2016. “Runjags: An R Package Providing Interface Utilities, Model Templates, Parallel Computing Methods and Additional Distributions for MCMC Models in JAGS.” Journal of Statistical Software, 25.
Gaynor, R Chris, Gregor Gorjanc, and John M Hickey. 2020. “AlphaSimR: An R-Package for Breeding Program Simulations.” bioRxiv, 21. https://doi.org/10.1101/2020.08.10.245167.
Gelman, A. 2006. “Prior Distributions for Variance Parameters in Hierarchical Models.” Bayesian Analysis 1 (3): 515–33. http://ba.stat.cmu.edu/journal/2006/vol01/issue03/gelman.pdf.
Gelman, A., J. B. Carlin, H. S. Stern, D. B. Dunson, A Vehtari, and D. B. Rubin. 2014. Bayesian Data Analysis. 3rd ed. New York: Taylor & Francis.
Harville, D. A. 1986b. “Using Ordinary Least Squares Software to Compute Combined Intra-Interblock Estimates of Treatment Contrasts.” American Statistician 40 (3): 153–57. http://www.jstor.org/stable/2684879.
Kass, R. E., B. P. Carlin, A. Gelman, and R. M. Neal. 1998. “Markov Chain Monte Carlo in Practice: A Roundtable Discussion.” American Statistician 52 (2): 93–100.
Lin, S. 1999. “Monte Carlo Bayesian Methods for Quantitative Traits.” Computational Statistics and Data Analysis 31 (1): 89–108. https://doi.org/10.1016/S0167-9473(99)00006-7.
Lourenco, D A L, S Tsuruta, B O Fragomeni, Y Masuda, I Aguilar, A Legarra, J K Bertrand, et al. 2015. “Genetic Evaluation Using Single-Step Genomic Best Linear Unbiased Predictor in American Angus.” Journal of Animal Science, 10. https://doi.org/10.2527/jas2014-8836.
Mehrabani-Yeganeh, H., J. P. Gipson, and L. R. Schaeffer. 2001. “Including Coefficients of Inbreeding in BLUP Evaluation and Its Effect on Response to Selection.” Journal of Animal Breeding and Genetics-Zeitschrift Fur Tierzuchtung Und Zuchtungsbiologie 117 (3): 145–51. https://doi.org/10.1046/j.1439-0388.2000.00241.x.
Pedersen, Thomas L. 2020. “Patchwork: The Composer of Plots.” https://CRAN.R-project.org/package=patchwork.
Plummer, M., N. Best, K. Cowles, and K. Vines. 2009. “The CODA R Package: Output Analysis and Diagnostics for MCMC.” Manual. http://cran.r-project.org/package=coda.
Quaas, R. L., R. D. Anderson, and A. R. Gilmour. 1984b. BLUP School Handbook: Use of Mixed Models for Prediction and Estimation of (Co)variance Components.
Ren, Kun, and Kenton Russell. 2016. “Formattable: Create ‘Formattable’ Data Structures.” https://CRAN.R-project.org/package=formattable.
Shariati, M., and D. Sorensen. 2009b. “Efficiency of Alternative MCMC Strategies Illustrated Using the Reaction Norm Model.” Journal of Animal Breeding and Genetics-Zeitschrift Fur Tierzuchtung Und Zuchtungsbiologie 125 (3): 176–86. https://doi.org/10.1111/j.1439-0388.2008.00716.x.
Slowikowski, Kamil. 2020. “Ggrepel: Automatically Position Non-Overlapping Text Labels with ’Ggplot2’.” https://CRAN.R-project.org/package=ggrepel.
Sorensen, D., R. Fernando, and D. Gianola. 2001. “Inferring the Trajectory of Genetic Variance in the Course of Artificial Selection.” Genetical Research 77 (1): 83–94. https://doi.org/10.1017/S0016672300004833.
Spiegelhalter, D., A. Thomas, N. Best, and D. Lunn. 2003. WinBUGS User Manual. http://www.mrc-bsu.cam.ac.uk/bugs/.
———. 2007. OpenBUGS User Manual. http://mathstat.helsinki.fi/openbugs/.
Valpine, Perry de, Christopher J. Paciorek, Daniel Turek, Nick Michaud, Cliff Anderson-Bergman, Fritz Obermeyer, Claudia Wehrhahn Cortes, Abel Rodrìguez, Duncan Temple Lang, and Sally Paganin. 2020. “NIMBLE: MCMC, Particle Filtering, and Programmable Hierarchical Modeling.” https://cran.r-project.org/package=nimble.
Van Vleck, L. D. 1994. “Algorithms for Simulation of Animal Models with Multiple Traits and with Maternal and Non-Additive Genetic Effects.” Brazil. J. Genet. 12 (1): 53–57.
Waagepetersen, R., N Ibáñez-Escriche, and D. Sorensen. 2008. “A Comparison of Strategies for Markov Chain Monte Carlo Computation in Quantitative Genetics.” Genetics Selection Evolution 40 (1): 161–76. https://doi.org/10.1051/gse:2007042.
Waldmann, P. 2009b. “Easy and Flexible Bayesian Inference of Quantitative Genetic Parameters.” Evolution 63 (6): 1640–43. https://doi.org/10.1111/j.1558-5646.2009.00645.x.
Waldmann, P., J. Hallander, F. Hoti, and M. J. Sillanpää. 2008b. “Efficient Markov Chain Monte Carlo Implementation of Bayesian Analysis of Additive and Dominance Genetic Variances in Noninbred Pedigrees.” Genetics 179 (2): 1101–12. https://doi.org/10.1534/genetics.107.084160.
Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag. https://ggplot2.tidyverse.org.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Muller. 2020. “Dplyr: A Grammar of Data Manipulation.” https://CRAN.R-project.org/package=dplyr.
Xie, Yihui. 2020. “Knitr: A General-Purpose Package for Dynamic Report Generation in R.”
Youngflesh, Casey. 2018. “MCMCvis: Tools to Visualize, Manipulate, and Summarize MCMC Output.” Journal of Open Source Software 3 (24): 640. https://doi.org/10.21105/joss.00640.
Zhu, Hao. 2020. “kableExtra: Construct Complex Table with ‘Kable’ and Pipe Syntax.” https://github.com/haozhu233/kableExtra.